UTROPHIN GENE EXPRESSION

The present invention generally relates to the
provision of nucleic acid from which a polypeptide with
utrophin function can be expressed, especially mini-genes
and chimaeric constructs. Expression of a utrophin
transgene significantly decreases the severity of the
dystrophic muscle phenotype in an animal model.

The severe muscle wasting disorders, Duchenne
muscular dystrophy (DMD) and the less debilitating
Becker muscular dystrophy (BMD), are due to mutations in
the dystrophin gene. Dystrophin is a large
cytoskeletal protein which in muscle is located at the
cytoplasmic surface of the sarcolemma, the
neuromuscular junction (NMJ) and myotendinous junction
(MTJ). The protein is composed of four domains: an
actin-binding domain (shown in vitro to bind actin), a
rod domain containing triple helical repeats, a
cysteine rich (CR) domain and a carboxy-terminal (CT)
domain. The majority of the CRCT binds to a complex of
proteins and glycoproteins (called the dystrophin
protein complex, DPC) spanning the sarcolemma. This
complex consists of cytoskeletal syntrophins and
dystrobrevin, transmembrane β-dystroglycan, α-, β-, δ-,
γ-sarcoglycans and extracellular α-dystroglycan. The DPC
links to laminin-α2 (merosin) in the extracellular
matrix and to the actin cytoskeleton via dystrophin
within the cell. The breakdown of the integrity of the
DPC due to the loss of, or impairment of, dystrophin
function leads to muscle degeneration and the DMD
phenotype. The structure of dystrophin and protein
interactions within the DPC have been recently reviewed
[1,2,3].

There are various approaches which can be adopted
for the gene therapy of DMD. These include myoblast
transfer, retroviral infection, adenoviral infection
and direct injection of plasmid DNA. In most cases the
dystrophin gene used in the experiments generates a
truncated protein approximately half the size of the
full size protein. This dystrophin minigene was
modelled on a natural mutation identified in a very
mild Becker patient [4]. The cloned version of this
truncated minigene is able to reverse the pathological
phenotype in the dystrophin deficient mdx mouse [5,6,7]
and has had limited success when delivered to mdx
muscle by viral vectors [8,9,10]. Although some
progress is being made in each of these areas using the
mdx mouse as a model system, there are problems related
to the number of muscle cells that can be made
dystrophin positive, the levels of expression of the
gene and the duration of expression [11]. Another
problem to be addressed is the rejection of cells
expressing dystrophin because of immunological
intolerance, i.e. dystrophin within these cells will
appear foreign to the host immune system given that
most DMD patients will never have expressed dystrophin
[12,13].

In order to circumvent some of these problems,
possibilities of compensating for dystrophin loss using
a related protein, utrophin, are being explored.

Utrophin is a 395 kDa protein encoded by a gene
located on chromosome 6q24 and shown to have strong
sequence similarity to dystrophin [14]. The actin
binding domains of dystrophin and utrophin have 85%
similarity and the DPC binding regions have 88%
similarity. Both of these domains have been shown to
function as predicted in vitro. The structure and
potential protein interactions are described in detail
in reviews [1,2,3].

There is a substantial body of evidence
demonstrating that utrophin is capable of localising to
the sarcolemma. During normal fetal muscle development
there is increased utrophin expression, localised to
the sarcolemma up until 18 weeks and 20 days gestation
in human and mouse respectively. After this time the
utrophin sarcolemmal staining steadily decreases to the
significantly lower adult levels shortly before birth,
where utrophin is localised almost exclusively to the
NMJ and MTJ [15,16,17]. The decrease in utrophin
expression coincides with increased expression of
dystrophin [17]. Many studies have shown that utrophin
is bound to the sarcolemma in DMD and BMD patients.
However, the levels of utrophin localised at the
sarcolemma vary from report to report [18,19,20,21].
In some other non-Xp21 myopathies, utrophin and
dystrophin are simultaneously bound to the sarcolemma
of adult skeletal muscle [22].

High levels of utrophin may protect muscle from
the consequences of dystrophin loss. Matsumura et al.
[23] demonstrated that purified membranes from the mdx
mouse contained complexes of utrophin and the DPC.
When quadriceps muscles (which show necrosis) from these
mice were analysed by immunoblotting, the level of
utrophin remained approximately the same; however, the
level of α-dystroglycan was drastically reduced.
In cardiac muscle (which shows no pathology) the level
of utrophin was elevated four fold with no loss of
α-dystroglycan. Immunocytochemical analysis of other
mdx small calibre skeletal muscles (extraocular and
toe) which also have no pathology shows increased
utrophin expression and normal levels of α-sarcoglycan.
This result suggests that the increased levels of
utrophin interact with the DPC (or an antigenically
related complex) at the sarcolemma and prevent loss of
the complex, so that the structure of these cells remains
normal. In the mdx mouse, utrophin levels in muscle
remain elevated soon after birth compared with normal
mice; however, once the utrophin levels have decreased
to the adult levels (about 1 week after birth), the
first signs of muscle fibre necrosis are detected
[15,16].

Thus, in certain circumstances utrophin can
localise to the sarcolemma, probably at the same binding
sites as dystrophin, namely actin and the DPC. If the
expression of utrophin is high enough, it may maintain
the DPC and thus alleviate the DMD phenotype. It is
unlikely that such external upregulation could be
tightly controlled, giving rise to potentially high
levels of utrophin within the cell. However, this may
not be a problem as Cox et al. [24] have demonstrated
that gross over-expression of dystrophin in the muscle
of transgenic mdx mice reverts the muscle pathology to
normal with no obvious detrimental side effects.

The present invention has arisen from cloning of
nucleic acid encoding utrophin and fragments of
utrophin from various species. The original aim was to
clone nucleic acid encoding human utrophin, but major
problems were encountered. A previous paper [14]
reported the amino acid sequence of utrophin (so-called
"dystrophin-related protein"), obtained by cloning of
overlapping cDNAs. However, two regions around the
amino terminal actin binding domain were not
represented in these clones. These regions could be
amplified by PCR and sequenced, but it has proved not
to be possible to clone them. Either clones which
should have included these regions were rearranged (as
determined by restriction mapping) or simply no clones
were isolated, even if highly recombination deficient E.
coli host strains (SURE and STBL2) were used. The gaps
in the sequence were identified by comparing the
sequence generated from the utrophin cDNAs to the
published human dystrophin sequence. It became
apparent, as further utrophin clones were isolated, that
none spanned these two gaps.

Sequence information obtained from the amino
terminus of the human cDNA was used to design probes
and rat and mouse cDNA libraries were screened. Rat
cDNAs were also unstable or rearranged in the region
corresponding to the unclonable regions in the human
sequence. Some large rat clones covering these regions
were obtained, but all attempts to generate subclones
failed due to rearrangements of the inserts as
determined by restriction mapping. Surprisingly, in
view of the difficulties with the human and rat
sequences, cDNA from the mouse library, covering the
regions in question, was found to be stable and
amenable to further manipulation including the
generation of smaller subclones.

Figure 1 shows a comparison between human, rat
and mouse utrophin nucleotide sequences encoding part
of the amino-terminal portion of the respective
proteins. The unclonable regions of the human gene are
underlined.

This cloning work enables for the first time the
construction of a nucleic acid molecule from which a
polypeptide with utrophin function can be expressed.

Furthermore, by way of analogy with the success
achieved with a dystrophin mini-gene (from which a
truncated version of dystrophin is expressed), the
present invention provides "utrophin mini-genes" and
polypeptides encoded thereby. To overcome the problem
of unclonability of regions of the human utrophin gene
sequence, the present inventors have realised that it
is possible to employ a sequence of nucleotides derived
from the mouse utrophin gene in a chimaeric construct
to provide for expression of a polypeptide with
utrophin function.

According to a first aspect of the present
invention there is provided a nucleic acid molecule
comprising a sequence of nucleotides encoding a
polypeptide with utrophin function.

A polypeptide with utrophin function is able to
bind actin and able to bind the dystrophin protein
complex (DPC).

Polypeptides with utrophin function are generally
distinguishable immunologically from dystrophin
polypeptides. For example, they may comprise at least
one epitope not found in dystrophin. Polypeptides with
utrophin function may be identified using specific
polyclonal or monoclonal antibodies which do not
cross-react with dystrophin. If a polypeptide is able to
bind actin and able to bind the dystrophin protein
complex and at least one antibody can bind it which
cannot bind dystrophin, then the polypeptide has
utrophin function. In a preferred embodiment, the
polypeptide can be bound by an antibody which binds
utrophin but not dystrophin; in other words, the
polypeptide shares at least one epitope with utrophin
which epitope is not found in dystrophin. In another
embodiment, the polypeptide does not contain an epitope
found in dystrophin, such that the polypeptide is not
bound by an antibody which binds dystrophin. In such a
case, the epitope recognised by the antibody which
binds dystrophin may be one not found in utrophin. The
polypeptide may contain no epitope found in dystrophin.
The immunological comparison may be made with human
utrophin and/or dystrophin, especially if the
polypeptide with utrophin function is intended for
human use, or with the utrophin and/or dystrophin of
the species in which use is intended, e.g. mouse.
Mouse monoclonal antibodies MANCH07 and MANNUT1 [31]
were used in the work described herein. Standard in
vitro binding assays may be used to assess
immunological cross-reactivity of a polypeptide.

Thus, the polypeptide comprises an actin-binding
domain and a dystrophin protein complex (DPC)-binding
domain and is utrophin-like as opposed to dystrophin-like,
e.g. as determined immunologically.

Preferably the encoding sequence comprises a
human sequence, i.e. a sequence obtainable from the
genome of a human cell.

Comparison of various amino acid sequences
reveals the following % similarities (calculated using
the method of Needleman and Wunsch (1970) J. Mol.
Biol. 48: 443-453, performed using the GAP program from
the Wisconsin Package v8, Genetics Computer Group, 575
Science Drive, Madison, Wisconsin 53711, USA) and
identities:

- full length human dystrophin v. human utrophin
  69% similarity, 50.7% identity;

- full length human utrophin v. rat utrophin
  93.2% similarity, 87.1% identity;

- full length human dystrophin v. mouse dystrophin
  95.4% similarity, 91.2% identity;

- human dystrophin C-terminus v. human utrophin
  C-terminus 84.1% similarity, 73.6% identity.
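
As an illustration of how a percentage identity of the kind listed above can be derived from a global alignment, the following is a minimal sketch of the Needleman and Wunsch method. It is not the GAP program: the match, mismatch and gap scores are simplified assumptions chosen for illustration, the GAP program's own scoring matrix and gap penalties are not reproduced here, and the function names are hypothetical. Identity is counted over aligned, ungapped positions, which is also an assumption.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Percentage of identical residues over aligned, ungapped columns."""
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)
```

A percentage similarity would additionally require a table of which non-identical residue pairs count as similar; that table is not given in the text and is therefore left out of the sketch.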

As noted, the present invention is only concerned
with "utrophin-like" molecules, not "dystrophin-like"
molecules. Thus, polypeptides according to the present
invention (e.g. as encoded by nucleic acid according to
the invention) may have an amino acid sequence which is
greater than about 75% similar, preferably greater than
about 80%, about 85%, about 90%, about 95% or about 98%
similar, to the amino acid sequence of Figure 3 or
the amino acid sequence of Figure 9, taken over the
full length. The polypeptides may have an amino acid
identity of greater than about 55%, preferably
greater than about 60%, about 70%, about 80%,
about 90%, about 95% or about 98% identity over the
full length. The levels of similarity and/or identity
may be lower outside the C-terminal, DPC-binding domain
provided the DPC-binding domain has greater than about
85% similarity, preferably greater than about 90%,
about 95% or about 98% similarity with the DPC-binding
domain amino acid sequence of Figure 3 or Figure 9, or
has greater than about 80%, preferably greater than
about 85%, about 90%, about 95% or about 98% identity
with the DPC-binding domain amino acid sequence of
Figure 3 or Figure 9. Particular amino acid sequence
variants or derivatives may have a sequence which
differs from the sequence of Figure 3 or Figure 9 by
one or more of insertion, addition, substitution or
deletion of 1 amino acid, 2, 3, 4, 5-10, 10-20, 20-30,
30-50, 50-100, 100-150, or more than 150 amino acids.

The nucleic acid molecule may be an isolate, or
in an isolated and/or purified form, that is to say not
in an environment in which it is found in nature,
removed from its natural environment. It may be free
from other nucleic acid obtainable from the same
species, e.g. encoding another polypeptide.

The nucleic acid molecule may be one which is not
found in nature. For example, the sequence of
nucleotides may form part of a cloning vector and/or an
expression vector, as discussed further below. The
sequence of nucleotides may represent a variant or
derivative of a naturally occurring sequence by virtue
of comprising an addition, insertion, deletion and/or
substitution of one or more nucleotides with respect to
the natural sequence, provided preferably that the
encoded polypeptide has the specified characteristics.
The addition, insertion, deletion and/or substitution
of one or more nucleotides may or may not be reflected
in an alteration in the encoded amino acid sequence,
depending on the genetic code.

Preferably, the nucleic acid molecule is a
"mini-gene", i.e. the polypeptide encoded does not
correspond to full-length utrophin but is rather a
shorter, truncated version. For instance, part or all of
the rod domain may be missing, such that the polypeptide
comprises an actin-binding domain and a DPC-binding
domain but is shorter than naturally occurring
utrophin. In a full-length utrophin gene, the
actin-binding domain is encoded by nucleotides 1-739, while
the DPC-binding domain (CRCT) is encoded by nucleotides
8499-10301 (where 1 represents the start of
translation; Figure 2A). The respective domains in the
polypeptide encoded by a mini-gene according to the
invention may comprise amino acids corresponding to
those encoded by these nucleotides in the full-length
coding sequence.
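
The nucleotide ranges given above can be related to amino acid (codon) positions with simple arithmetic. The sketch below assumes, as the text states, that nucleotide 1 is the first base of the start of translation; the function and constant names are hypothetical and introduced only for illustration.

```python
# Nucleotide ranges quoted in the text (1-based, inclusive).
ACTIN_BINDING_NT = (1, 739)
DPC_BINDING_NT = (8499, 10301)


def nt_to_codon(nt_pos):
    """Return the 1-based codon number containing a 1-based nucleotide position."""
    if nt_pos < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_pos + 2) // 3


def nt_range_to_codons(nt_range):
    """Map an inclusive nucleotide range to the inclusive range of codons it touches."""
    start, end = nt_range
    return nt_to_codon(start), nt_to_codon(end)


if __name__ == "__main__":
    print("actin-binding codons:", nt_range_to_codons(ACTIN_BINDING_NT))
    print("DPC-binding codons:", nt_range_to_codons(DPC_BINDING_NT))
```

Neither quoted range ends on a codon boundary, so a mini-gene built from these domains has to be joined in a way that preserves the reading frame; the sketch only reports which codons the ranges touch and does not attempt such a join.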

Dystrophin mini-genes have been shown to be
active in animal models (as discussed). Advantages of
a mini-gene over a sequence encoding a full-length
utrophin molecule or derivative thereof include easier
manipulation and inclusion in vectors, such as
adenoviral and retroviral vectors for delivery and
expression.

A further preferred non-naturally occurring
molecule encoding a polypeptide with the specified
characteristics is a chimaeric construct wherein the
encoding sequence comprises a sequence obtainable from
one mammal, preferably human ("a human sequence"), and
a sequence obtainable from another mammal, preferably
mouse ("a mouse sequence"). Such a chimaeric construct
may of course comprise the addition, insertion,
substitution and/or deletion of one or more nucleotides
with respect to the parent mammalian sequences from
which it is derived. Preferably, the part of the
coding sequence which encodes the actin-binding domain
comprises a sequence of nucleotides obtainable from the
mouse, or other non-human mammal, or a sequence of
nucleotides derived from a sequence obtainable from the
mouse, or other non-human mammal.

In a preferred embodiment, the sequence of
nucleotides encoding the polypeptide comprises the sequence
GAGGCAC at residues 331-337 and/or the sequence
GATTGTGGATGAAAACAGTGGG at residues 1453-1475 (using the
conventional numbering from the initiation codon ATG),
and a sequence obtainable from a human.
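
Whether a candidate coding sequence carries the two sequences named above at the stated positions can be checked by direct comparison. The following is a minimal sketch with hypothetical function names; it assumes position 1 is the A of the initiation codon ATG. As printed, the second sequence has 22 letters while the range 1453-1475 spans 23 positions, so the sketch anchors each comparison on the stated start position only and makes no assumption about the end position.

```python
# Start position (1-based, from the A of ATG) mapped to the sequence expected there.
MOTIFS = {
    331: "GAGGCAC",
    1453: "GATTGTGGATGAAAACAGTGGG",
}


def has_motif_at(coding_seq, motif, start):
    """True if motif occurs in coding_seq beginning at 1-based position start."""
    begin = start - 1  # convert to 0-based index
    return coding_seq[begin:begin + len(motif)].upper() == motif


def motifs_present(coding_seq):
    """Report, per start position, whether the expected motif is found there."""
    return {start: has_motif_at(coding_seq, motif, start)
            for start, motif in MOTIFS.items()}
```

The result is a dictionary keyed by start position, so the "and/or" wording of the embodiment can be evaluated by the caller with `any(...)` or `all(...)` over its values.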

The nucleic acid molecule may comprise a
nucleotide sequence encoding a sequence of amino acids
shown in Figure 1. As discussed, the encoding sequence
may be chimaeric, i.e. comprise sequences of
nucleotides from different species, e.g. a sequence
from or derivable from a human and a sequence from or
derivable from a mouse or other non-human mammal.

A chimaeric mini-gene encoding sequence according
to the present invention is shown in Figure 3.
Preferred embodiments of the present invention include
a nucleic acid molecule comprising a sequence of
nucleotides encoding a polypeptide which has an
actin-binding domain and a DPC-binding domain and which
polypeptide comprises an amino acid sequence encoded by
a sequence of nucleotides shown in Figure 3, a nucleic
acid molecule comprising a sequence of nucleotides
encoding a variant, allele or derivative of such a
polypeptide by way of addition, substitution, insertion
and/or deletion of one or more amino acids, and a
nucleic acid molecule comprising a sequence of
nucleotides which is a variant, allele or derivative of
the sequence shown in Figure 3, by way of addition,
substitution, insertion and/or deletion of one or more
nucleotides, with or without a change in the encoded
amino acid sequence with respect to the amino acid
sequence encoded by a sequence of nucleotides shown in
Figure 3. The proviso is that the encoded polypeptide
is "utrophin-like" rather than "dystrophin-like", e.g.
as determined immunologically as discussed.

One particular variant or derivative of the
sequence of Figure 3 has a sequence as shown in Figure
9, which is a "full-length" utrophin construct,
including rod domain sequences not included in the
mini-gene of Figure 3.

The sequences of Figure 3 and Figure 9 include
some positions at which the precise residue is left
open (marked by "N" in the nucleotide sequence and "X"
in the amino acid sequence). Comparison of the human,
mouse and rat utrophin sequences in this region (Figure
10) shows that the human and rat amino acid sequences
are absolutely conserved here. Accordingly, the twelve
"X's" in Figures 3 and 9 may represent the amino acid
sequence DKKSIIMYLTSL. Instead, in accordance with the
discussion of variants and derivatives herein, a
polypeptide according to the invention (as encoded by
nucleic acid according to the invention) may include a
variant or derivative sequence, by way of one or more
of insertion, addition, substitution or deletion of one
or more amino acids of the sequence DKKSIIMYLTSL, in
the position marked by the X's in Figures 3 and 9.
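
The substitution of the twelve open positions by the conserved sequence can be sketched as a simple string operation. The helper name below is hypothetical, and the example protein fragment is invented for illustration rather than taken from Figure 3 or Figure 9.

```python
import re

# Conserved human/rat sequence proposed in the text for the twelve X positions.
CONSERVED = "DKKSIIMYLTSL"


def fill_open_positions(protein, fill=CONSERVED):
    """Replace a run of exactly len(fill) 'X' residues with the fill sequence.

    Runs of any other length are left untouched, since the text only
    addresses the single run of twelve open positions.
    """
    pattern = r"(?<!X)X{%d}(?!X)" % len(fill)
    return re.sub(pattern, fill, protein)
```

For example, `fill_open_positions("MAK" + "X" * 12 + "GE")` yields `"MAKDKKSIIMYLTSLGE"`. A variant or derivative of the conserved sequence could be supplied through the `fill` argument, provided it has the length of the run to be replaced.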

Nucleic acid according to the present invention
is obtainable by hybridising nucleic acid of target
cells (e.g. human, mouse, rat) with one or more oligo-
or poly-nucleotides with sequences designed based on
the sequence information presented in Figure 1, Figure
3 or Figure 9. Thus, the full mouse sequence, or the
sequence in the region marked by the X's in Figures 3
and 9, may be obtained by probing or PCR using sequence
information provided herein (e.g. Figure 1).

Nucleic acid according to the present invention
is obtainable using one or more oligonucleotide probes
or primers designed to hybridise with one or more
fragments of a nucleic acid sequence shown in Figure 1,
Figure 3 or Figure 9, particularly fragments of
relatively rare sequence, based on codon usage or
statistical analysis. The amino acid sequence
information provided may be used in design of
degenerate probes/primers or "long" probes. A primer
designed to hybridise with a fragment of the nucleic
acid sequence shown may be used in conjunction with one
or more oligonucleotides designed to hybridise to a
sequence in a cloning vector within which target
nucleic acid has been cloned, or in so-called "RACE"
(rapid amplification of cDNA ends) in which cDNAs in a
library are ligated to an oligonucleotide linker and
PCR is performed using a primer which hybridises with
the sequence shown in the figure and a primer which
hybridises to the oligonucleotide linker.
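
When designing degenerate probes or primers from amino acid sequence, one common consideration is how many distinct nucleotide sequences could encode a given stretch of residues. The sketch below counts this using the number of codons per amino acid in the standard genetic code and picks the least degenerate window of a chosen length. The function names and the window-selection approach are illustrative assumptions, not a procedure described in the text.

```python
# Number of codons encoding each amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide):
    """Number of distinct nucleotide sequences that could encode the peptide."""
    total = 1
    for residue in peptide.upper():
        total *= CODON_COUNTS[residue]
    return total


def best_window(peptide, length):
    """Return (window, degeneracy) for the least degenerate window of given length.

    Ties are resolved in favour of the window closest to the amino terminus.
    """
    if length < 1 or length > len(peptide):
        raise ValueError("window length must be between 1 and len(peptide)")
    windows = [peptide[i:i + length] for i in range(len(peptide) - length + 1)]
    best = min(windows, key=degeneracy)
    return best, degeneracy(best)
```

Windows rich in methionine and tryptophan score lowest, which is one simple way of identifying a fragment of "relatively rare sequence" in the sense used above.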

Nucleic acid isolated and/or purified from one or
more cells (e.g. human, mouse) or a nucleic acid
library derived from nucleic acid isolated and/or
purified from cells (e.g. a cDNA library derived from
mRNA isolated from the cells), may be probed under
conditions for selective hybridisation and/or subjected
to a specific nucleic acid amplification reaction such
as the polymerase chain reaction (PCR).

A method may include hybridisation of one or more
(e.g. two) probes or primers to target nucleic acid.
Where the nucleic acid is double-stranded DNA,
hybridisation will generally be preceded by
denaturation to produce single-stranded DNA. The
hybridisation may be as part of a PCR procedure, or as
part of a probing procedure not involving PCR. An
example procedure would be a combination of PCR and low
stringency hybridisation. A screening procedure,
chosen from the many available to those skilled in the
art, is used to identify successful hybridisation
events and isolate hybridised nucleic acid.

Probing may employ the standard Southern blotting
technique. For instance, DNA may be extracted from
cells and digested with different restriction enzymes.
Restriction fragments may then be separated by
electrophoresis on an agarose gel, before denaturation
and transfer to a nitrocellulose filter. Labelled
probe may be hybridised to the DNA fragments on the
filter and binding determined. DNA for probing may be
prepared from RNA preparations from cells.
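
The digestion step of the Southern blotting procedure can be mimicked in silico to predict which fragment sizes a given enzyme would produce. The sketch below is a simplified illustration: it treats the DNA as a linear top strand, and it uses EcoRI (recognition site GAATTC, cut after the first G) purely as an example, since the text does not name particular enzymes. The function names are hypothetical.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Cut a linear sequence at every occurrence of site.

    cut_offset is the number of bases of the site left on the upstream
    fragment (1 for EcoRI, which cuts G^AATTC).
    """
    seq = seq.upper()
    cut_points = []
    pos = seq.find(site)
    while pos != -1:
        cut_points.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    fragments = []
    previous = 0
    for cut in cut_points:
        fragments.append(seq[previous:cut])
        previous = cut
    fragments.append(seq[previous:])
    return fragments


def fragment_lengths(seq, site="GAATTC", cut_offset=1):
    """Fragment lengths sorted largest first, as they would order on a gel."""
    return sorted((len(f) for f in digest(seq, site, cut_offset)), reverse=True)
```

Comparing predicted fragment lengths for different enzymes is the computational counterpart of the restriction mapping used above to detect rearranged clones.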

Preliminary experiments may be performed by
hybridising under low stringency conditions various
probes to Southern blots of DNA digested with
restriction enzymes. Suitable conditions would be
achieved when a large number of hybridising fragments
were obtained while the background hybridisation was
low. Using these conditions nucleic acid libraries,
e.g. cDNA libraries representative of expressed
sequences, may be searched.

It may be necessary for one or more gene
fragments to be ligated to generate a full-length
coding sequence. Also, where a full-length encoding
nucleic acid molecule has not been obtained, a smaller
molecule representing part of the full molecule may be
used to obtain full-length clones. Inserts may be
prepared from partial cDNA clones and used to screen
cDNA libraries.

Those skilled in the art are well able to employ
suitable conditions of the desired stringency for
selective hybridisation, taking into account factors
such as oligonucleotide length and base composition,
temperature and so on.
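
The dependence of hybridisation stringency on oligonucleotide length and base composition can be illustrated with the widely used Wallace rule, which estimates the melting temperature of a short oligonucleotide as 2 °C per A or T plus 4 °C per G or C. This rule is not taken from the text; it is offered here only as a rough, hedged estimate that is generally applied to short oligonucleotides, and the function name is hypothetical.

```python
def wallace_tm(oligo):
    """Rough melting temperature (degrees C) of a short DNA oligonucleotide.

    Uses the Wallace rule: Tm = 2 * (A + T) + 4 * (G + C).
    """
    oligo = oligo.upper()
    unexpected = set(oligo) - set("ACGT")
    if unexpected:
        raise ValueError("unexpected characters: %s" % sorted(unexpected))
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc
```

Longer or more GC-rich oligonucleotides give higher estimates, so a hybridisation temperature set closer to the estimate corresponds to more stringent conditions for that probe.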

Systems for cloning and expression of a
polypeptide in a variety of different host cells are
well known. Suitable host cells include bacteria,
mammalian cells, yeast and baculovirus systems.
Mammalian cell lines available in the art for
expression of a heterologous polypeptide include
Chinese hamster ovary cells, HeLa cells, baby hamster
kidney cells and many others. A common, preferred
bacterial host is E. coli.

Nucleic acid according to the present invention
may form part of a cloning vector and/or a vector from
which the encoded polypeptide may be expressed.
Suitable vectors can be chosen or constructed,
containing appropriate and appropriately positioned
regulatory sequences, including promoter sequences,
terminator fragments, polyadenylation sequences,
enhancer sequences, marker genes and other sequences as
appropriate. Vectors may be plasmids, viral e.g.
'phage, or phagemid, as appropriate. For further
details see, for example, Molecular Cloning: a
Laboratory Manual: 2nd edition, Sambrook et al., 1989,
Cold Spring Harbor Laboratory Press. Many known
techniques and protocols for manipulation of nucleic
acid, for example in preparation of nucleic acid
constructs, mutagenesis, sequencing, introduction of
DNA into cells and gene expression, and analysis of
proteins, are described in detail in Short Protocols in
Molecular Biology, Second Edition, Ausubel et al. eds.,
John Wiley & Sons, 1992. The disclosures of Sambrook
et al. and Ausubel et al. are incorporated herein by
reference.

Thus, a further aspect of the present invention
provides a host cell containing nucleic acid as
disclosed herein. A still further aspect provides a
method comprising introducing such nucleic acid into a
host cell. The introduction may employ any available
technique. For eukaryotic cells, suitable techniques
may include calcium phosphate transfection,
DEAE-Dextran, electroporation, liposome-mediated
transfection and transduction using retrovirus or other
virus, e.g. vaccinia or, for insect cells, baculovirus.
For bacterial cells, suitable techniques may include
calcium chloride transformation, electroporation and
transfection using bacteriophage.

The introduction may be followed by causing or
allowing expression from the nucleic acid, e.g. by
culturing host cells under conditions for expression of
the gene.

In one embodiment, the nucleic acid of the
invention is integrated into the genome (e.g.
chromosome) of the host cell. Integration may be
promoted by inclusion of sequences which promote
recombination with the genome, in accordance with
standard techniques.

The invention also provides a mammal, such as a
human, primate or rodent, preferably rat or mouse,
comprising a host cell as provided, and methods of
production and use of such a mammal. The mammal may be
non-human. Transgenic animals, particularly mice, can
be generated using any available technique.
Particularly suitable for purposes of study are mdx
mice or others with a dystrophic phenotype.

The polypeptide encoded by the nucleic acid may
be expressed from the nucleic acid in vitro, e.g. in a
cell-free system or in cultured cells, or in vivo. In
vitro expression may be useful in determining ability
of the polypeptide to bind to actin and/or DPC. This
may be useful in testing or screening for substances
able to modulate one or both of these binding
activities. In particular, substances able to increase
actin and/or DPC binding of the polypeptide will add to
the repertoire of molecules available for potential
pharmaceutical/therapeutic exploitation. Such
substances, identified as modulators of one or both of
the binding activities of the polypeptide, following
expression of the polypeptide from encoding nucleic
acid therefor, may be investigated further and may be
manufactured and/or used in preparation of a
medicament, pharmaceutical composition or drug which
may subsequently be administered to an individual. In
vivo expression is discussed further below.

According to a further aspect of the present
invention there is provided a polypeptide with utrophin
function (other than utrophin itself). Such a
polypeptide comprises an actin-binding domain and a
DPC-binding domain and is immunologically recognisable
as utrophin-like rather than dystrophin-like, as
discussed, not being a naturally occurring polypeptide.
The polypeptide may be any of those discussed above as
being encoded by nucleic acid according to the present
invention. In particular, the polypeptide may be
shorter than naturally occurring full-length utrophin,
for example by virtue of lacking all or part of the rod
domain. The actin-binding and DPC-binding domains may
correspond to those of human, mouse or other non-human
utrophin or may be derived therefrom by way of
addition, substitution, insertion and/or deletion of
one or more amino acids. The polypeptide may be
chimaeric, comprising sequences of amino acids from or
derived from different species, e.g. human and mouse,
as discussed.

A convenient way of producing a polypeptide
according to the present invention is to express
nucleic acid encoding it. Accordingly, methods of
making such polypeptides by expression from encoding
nucleic acid therefor are provided by the present
invention, in vitro, e.g. in cell-free systems or by
culturing host cells under appropriate conditions for
expression, or in vivo.

Polypeptides and nucleic acid according to the
invention may be used in the manufacture of
medicaments, compositions, including pharmaceutical
formulations, and drugs for delivery to an individual,
e.g. a human with muscular dystrophy or a non-human
mammal, such as a mouse, as a model for study of the
polypeptides, muscular dystrophy and therapy thereof.

For example, a method of treatment practised on
the human or animal body in accordance with the present
invention may comprise administration to an individual
of nucleic acid encoding a polypeptide as disclosed
herein. The nucleic acid may form part of a construct
enabling expression within cells of the individual.
Nucleic acid may be introduced into cells using a
retroviral vector, preferably one which will not
transform cells, or using liposome technology.

Administration is preferably in a
"therapeutically effective amount", this being
sufficient to show benefit to a patient. Such benefit
may be at least amelioration of at least one symptom.
The actual amount administered, and rate and
time-course of administration, will depend on the nature and
severity of what is being treated. Prescription of
treatment, e.g. decisions on dosage etc., is within the
responsibility of general practitioners and other
medical doctors.

A composition may be administered alone or in
combination with other treatments, either
simultaneously or sequentially dependent upon the
condition to be treated.

Pharmaceutical compositions according to the
present invention, and for use in accordance with the
present invention, may comprise, in addition to active
ingredient, a pharmaceutically acceptable excipient,
carrier, buffer, stabiliser or other materials well
known to those skilled in the art. Such materials
should be non-toxic and should not interfere with the
efficacy of the active ingredient. The precise nature
of the carrier or other material will depend on the
route of administration, which may be oral, or by
injection, e.g. cutaneous, subcutaneous or intravenous.

Pharmaceutical compositions for oral
administration may be in tablet, capsule, powder or
liquid form. A tablet may comprise a solid carrier
such as gelatin or an adjuvant. Liquid pharmaceutical
compositions generally comprise a liquid carrier such
as water, petroleum, animal or vegetable oils, mineral
oil or synthetic oil. Physiological saline solution,
dextrose or other saccharide solution or glycols such
as ethylene glycol, propylene glycol or polyethylene
glycol may be included.

For intravenous, cutaneous or subcutaneous 
injection, or injection at the site of affliction, the 
active ingredient may be in the form of a parenterally 
acceptable aqueous solution which is pyrogen-free and
has suitable pH, isotonicity and stability. Those of
relevant skill in the art are well able to prepare
suitable solutions using, for example, isotonic
vehicles such as Sodium Chloride Injection, Ringer's
Injection, Lactated Ringer's Injection. Preservatives,
stabilisers, buffers, antioxidants and/or other 
additives may be included, as required. 

Injection may be used to deliver nucleic acid to 
disease sites. Internally, suitable imaging devices 
may be employed to guide an injecting needle to the
desired site. 

It may be desirable to remove cells from the 
body, treat them then return them to the body, or to 
administer cells derived from cells removed from an 
individual. This might be appropriate, for example, if
muscle stem cells can be isolated. Muscle precursor
cells ("mpc") have been used in cell therapy in mdx
mice, where implantation of normal mpc gave rise to
substantial amounts of dystrophin [25,26,27].
Immunosuppression increases success of cell
implantation procedures [13]. Myoblasts may be used to
introduce genes into muscle fibres during growth or
repair, as has been demonstrated using a
replication-defective retroviral vector to introduce a
mini-dystrophin construct into proliferating myogenic cells
in tissue culture [28].

Thus, cells in culture may have nucleic acid
according to the present invention introduced into them
before the cells are grafted into muscles in a patient.
Grafting the cells back into the donor has the
advantages of a genetically corrected autologous
transplant. Nucleic acid may be introduced locally
into cells using transfection, electroporation,
microinjection, liposomes, lipofection or as naked DNA
or RNA, or using any other suitable technique.

Retroviral vectors have also been used to
introduce the dystrophin mini-gene into the myoblasts
of spontaneously regenerating muscle of the mdx mouse
to produce dystrophin-positive fibres [8]. Recombinant
replication-defective adenoviruses appear particularly
effective as a means of introducing
constructs into skeletal muscle fibres for persistent
expression [29]. See reference 11 for a review of
myoblast-based gene therapies.

Adenoviral, retroviral or other viral vectors may
be used advantageously for the introduction of a
utrophin sequence according to the present invention
into muscle cells. Even though in vivo transduction
may be restricted to growing or regenerating muscle
fibres, retrovirally introduced constructs have the
advantage of becoming integrated into the genome of the
host cell, potentially conferring lifelong expression.

Liposomes may be used as vehicles for delivery of
nucleic acid constructs to skeletal muscle.
Intravenous injection of constructs in cationic
liposomes has resulted in widespread transfection of
most tissues, including skeletal muscle [30]. Lack of
immunogenicity allows for repeated administration and
lack of tissue specificity may be accommodated by
choosing a muscle-specific promoter to drive
expression.

For use in distinguishing polypeptide with
utrophin function from dystrophin and related
polypeptides, antibodies may be obtained using
techniques which are standard in the art. Methods of
producing antibodies include immunising a mammal (e.g.
mouse, rat, rabbit, horse, goat, sheep or monkey) with
the protein or a fragment thereof, or a cell or virus
which expresses the protein or fragment. Immunisation
with DNA encoding a target polypeptide is also possible
(see for example Wolff, et al. Science 247: 1465-1468
(1990); Tang, et al. Nature 356: 152-154 (1992); Ulmer
J B, et al. Science 259: 1745-1749 (1993)). Antibodies
may be obtained from immunised animals using any of a
variety of techniques known in the art, and screened,
preferably using binding of antibody to antigen of
interest. For instance, Western blotting techniques or
immunoprecipitation may be used (Armitage et al, 1992,
Nature 357: 80-82).

The production of monoclonal antibodies is well
established in the art. Monoclonal antibodies can be
subjected to the techniques of recombinant DNA
technology to produce other antibodies or chimeric
molecules which retain the specificity of the original
antibody. Such techniques may involve introducing DNA
encoding the immunoglobulin variable region, or the
complementarity determining regions (CDRs), of an
antibody to the constant regions, or constant regions
plus framework regions, of a different immunoglobulin.
See, for instance, EP184187A, GB 2188638A or
EP-A-0239400. A hybridoma producing a monoclonal antibody
may be subject to genetic mutation or other changes,
which may or may not alter the binding specificity of
antibodies produced.

As an alternative or supplement to immunising a
mammal with a peptide, an antibody specific for a
protein may be obtained from a recombinantly produced
library of expressed immunoglobulin variable domains,
e.g. using lambda bacteriophage or filamentous
bacteriophage which display functional immunoglobulin
binding domains on their surfaces; for instance see
WO92/01047. The library may be naive, that is
constructed from sequences obtained from an organism
which has not been immunised with the target, or may be
one constructed using sequences obtained from an
organism which has been exposed to the antigen of
interest (or a fragment thereof).

Antibodies may be modified in a number of ways.
Indeed the term "antibody" should be construed as
covering any specific binding substance having a
binding domain with the required specificity. Thus
this covers antibody fragments, derivatives, functional
equivalents and homologues of antibodies, including any
polypeptide comprising an immunoglobulin binding
domain, whether natural or synthetic. Chimaeric
molecules comprising an immunoglobulin binding domain,
or equivalent, fused to another polypeptide are
therefore included. Cloning and expression of
chimaeric antibodies are described in EP-A-0120694 and
EP-A-0125023.

It has been shown that the function of binding
antigens can be performed by fragments of a whole
antibody. Example binding fragments are (i) the Fab
fragment consisting of VL, VH, CL and CH1 domains; (ii)
the Fd fragment consisting of the VH and CH1 domains;
(iii) the Fv fragment consisting of the VL and VH
domains of a single antibody; (iv) the dAb fragment
(Ward, E.S. et al., Nature 341, 544-546 (1989)) which
consists of a VH domain; (v) isolated CDR regions; (vi)
F(ab')2 fragments, a bivalent fragment comprising two
linked Fab fragments; (vii) single chain Fv molecules
(scFv), wherein a VH domain and a VL domain are linked
by a peptide linker which allows the two domains to
associate to form an antigen binding site (Bird et al,
Science, 242, 423-426, 1988; Huston et al, PNAS USA,
85, 5879-5883, 1988); (viii) bispecific single chain Fv
dimers (PCT/US92/09965) and (ix) "diabodies",
multivalent or multispecific fragments constructed by
gene fusion (WO94/13804; P. Holliger et al Proc. Natl.
Acad. Sci. USA 90 6444-6448, 1993).

Further aspects and embodiments of the present 
invention, and modifications to aspects and embodiments 
disclosed herein, will be apparent to those skilled in 
the art.

The following figures are attached hereto:

Figure 1: Figure 1a shows the corresponding
parts of nucleotide sequences of the mouse (Moutro),
rat (ratutro) and human (humutro) starting from the
first amino acid and encompassing the actin binding
domain and start of the rod domain. The heavy line
represents the unclonable region in rat and human.
Figure 1b shows the sequence of the second unclonable
region.

Figure 2 is a schematic of the cloning process of
PCR6.0. Figure 2A: The bold numbers represent the
utrophin transcript in kb and the numbers below the
line represent the nucleotide positions of the regions
in question where 1 is the start of translation; Figure
2B represents the cDNAs used as template for the PCR;
Figure 2C represents the two PCR fragments generated to
form PCR6.0.

Figure 3 shows the nucleotide sequence (both
strands) of a "utrophin mini-gene" according to the
present invention and whose construction is described
herein.

Figure 4 shows a representation of the dystrophin
and utrophin polypeptides, showing the various domains,
and "mini-genes" comprising only parts of the
full-length molecules.

Figure 5: Utrophin transgene construction and
expression. A, Scale representation of dystrophin,
utrophin and the two truncated transgenes. The
spectrin-like repeats (R) and the potential
hinge sites (H) are marked. B, Utrophin transgene
vector. The N- and C-terminal portions of utrophin
were cloned as PCR products using overlapping cDNAs as
template. The regions used are indicated by the dotted
lines. The PCR product was cloned into a vector
containing the 2.2kb human skeletal α-actin (HSA)
promoter and regulatory regions [20,21] and SV40 large T
poly A site. The cloning sites were such that the
transgene was located near the beginning of the second
HSA untranslated exon and the Asp718/NotI sites were
used to liberate the complete fragment. C, Immunoblot
of muscle from the utrophin transgenic line F-3 and a
non-transgenic C57BL/10 littermate. M, skeletal muscle;
H, heart; D, diaphragm.

Figure 6: Decrease in serum CK levels and
centrally nucleated myofibres in transgenic mdx mice. A, Serum
creatine kinase levels in male mdx mice expressing the
utrophin transgene. Serum creatine kinase levels from
5 week old mice generated from 4 F3 litters resultant
from a male transgenic mouse crossed with female mdx.
Offspring consisted of male hemizygous mdx (M mdx),
male utrophin transgenic mdx (M Tg mdx) and
heterozygous females (F mdx). Female heterozygotes are
not significantly different from wild type so can be
used as normal controls. The number of mice (n) in
each group is shown in parentheses and the mean ± SE is
shown by T-bars. B, Proportion of myofibres containing
centralised nuclei. The mean ± SE is shown by T-bars.

Figure 7: Decrease in centralised nuclei in
diaphragm and TA muscle from other truncated utrophin
transgenic mdx lines (Gerald, George, Grant, Gavin),
normal (n) and mdx (mdx). The mean ± SE is shown by
T-bars.

Figure 8: Decrease in serum creatine kinase from
other utrophin transgenic lines, normal (n) and mdx
(mdx). The number of mice in each group is shown in
parentheses.

Figure 9: Full length utrophin coding sequence
and encoded amino acid sequence.

Figure 10: Alignment of amino acid sequences for
the N-terminal regions of human, mouse and rat
utrophin.

All documents cited are incorporated herein by 
reference.

Example 1

Cloning of utrophin mini genes 

The four cDNAs covering the latter half of the
human utrophin transcript were ligated together using
overlapping restriction endonuclease sites. The
amino-terminal region was reconstructed using the human 92.2
cDNA joined by the common EcoRI restriction site to the
stable mouse cDNA clone, JT1. These two constructs
were then used as templates for PCR amplification
(Figure 2B). Primers were designed to generate two
fragments, PCR2.0 and PCR4.0, containing no
untranslated regions which could be ligated in frame to
generate a utrophin minigene containing approximately
the first 2kb and last 4kb of the utrophin coding
sequence (Figure 2C).
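
The requirement that the two PCR fragments ligate in frame can be illustrated with a minimal Python sketch. The fragment sequences are short hypothetical placeholders, not the PCR2.0 and PCR4.0 sequences, and the helper name is an assumption made purely for illustration.

```python
def ligates_in_frame(upstream: str, downstream: str) -> bool:
    """Return True if joining two coding fragments keeps the reading frame.

    Assumes both fragments start on a codon boundary and carry no
    untranslated sequence, as stated for the two PCR fragments.
    """
    # Each fragment must consist of whole codons; a partial codon at the
    # junction would shift the frame of everything downstream of it.
    return len(upstream) % 3 == 0 and len(downstream) % 3 == 0


# Hypothetical toy fragments standing in for the two PCR products.
n_terminal = "ATGGCCAAG"   # 9 nt: three complete codons
c_terminal = "GAGCTGTAA"   # 9 nt: two codons followed by a stop codon

if ligates_in_frame(n_terminal, c_terminal):
    minigene = n_terminal + c_terminal
```

A fragment whose length is not a multiple of three (for example a 5 nt upstream piece) would fail this check and would need its primers redesigned.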

The two PCR fragments were ligated together using
the HpaI site. The complete DNA sequence of the 6.0kb
minigene is shown in Figure 3. The complete 6kb
minigene was excised from the vector and ligated into
the eukaryotic expression vectors. SV40-pA consists of
the SV40 early promoter linked to exon 1 and part of
exon 2 (including the intron) of rabbit β-globin to
facilitate splicing of any cloned insert. This is of
particular importance if the construct is to be used to
generate transgenic lines. After a single unique blunt
restriction site for cloning inserts, there is the
SV40 small T poly A signal sequence. The SV40 promoter
will express the minigene in all tissues. The HSA-pA
construct is similar except for the use of the human
skeletal α-actin promoter and tissue specific
regulatory sequences which will direct expression of
the minigene product only in skeletal muscle.

Once cloned into the expression vectors the
unique HpaI site was used to clone in a PCR generated
fragment containing the remainder of the utrophin rod
domain. We now have expression vectors containing a
truncated and a full length utrophin coding sequence.

The unique HpaI restriction site has also been
used to clone in a synthetic oligonucleotide coding for
the amino acid sequence which is recognised by a
specific antibody to the myc protein. This will enable
minigene constructs to be localised by virtue of their
expression of the myc tag and recognition by the
antibody. For utrophin, detecting the transgene product
would otherwise be a problem as the
endogenous gene is expressed in all cell types. The
use of the tag will demonstrate the presence of the
minigene when delivered in a gene therapy protocol.
There are available a number of other tags including
the Flag epitope (IBI) and Green Fluorescent Protein
(Clontech) which could be used in a similar fashion.

The utrophin minigene generated consists of the
same domains and repeats as the dystrophin minigene
(Figure 4). The dystrophin minigene was originally
copied in vitro from a naturally occurring dystrophin
mutation which gave rise to a mild Becker muscular
dystrophy phenotype. It has been used successfully in
a number of viral vectors being designed for potential
gene therapy routes and in transgenic lines which
ameliorate the abnormal muscle phenotype in mdx mice.
Thus the utrophin minigene will be suitable for cloning
into viral vectors designed for specific tissue
expression in potential gene therapy procedures.

Verification of the integrity of minigenes 

It was important to screen for the maintenance of
an open reading frame in the PCR generated clones given
the propensity for Taq thermostable polymerase to
introduce mutations. The PCR products were cloned into
a vector which had RNA polymerase binding sites
allowing the cloned insert to be transcribed and
translated generating a radiolabelled protein. If
expressed proteins of the correct molecular weight
were observed it was inferred that the PCR product
had no stop mutations. These products were then
western blotted to see if they were recognised by
utrophin antibodies. A positive result demonstrated
that the expressed protein was in the correct frame to
generate the epitopes recognised by the antibodies.
Ten different clones for each of PCR2.0 and PCR4.0
were screened in this manner. In all cases full length
expression was observed. All PCR2.0 and PCR4.0 clones
were detected by MANNUT1 [31] (which recognises the
actin binding domain) and MANCH07 [18] (which recognises
the latter half of the carboxy-terminus) respectively.
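
The screen described above was done at the protein level, by in vitro transcription and translation. Purely as an illustration of what a "stop mutation" means for an open reading frame, the following Python sketch scans a coding sequence for an internal stop codon. It is an in silico analogue under stated assumptions, not the method used here; the sequences and function name are hypothetical.

```python
# The three stop codons of the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_premature_stop(cds: str):
    """Return the 0-based codon index of the first internal stop, or None.

    The sequence is read in frame from its first nucleotide; any trailing
    partial codon is ignored, and the final complete codon is allowed to
    be the natural stop.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            return index
    return None


# Hypothetical toy clones: one intact, one with a point mutation (AAG -> TAG).
intact_clone = "ATGGCCAAGGAGTAA"
mutant_clone = "ATGGCCTAGGAGTAA"
```

A clone returning `None` would be expected to give a full-length translation product, whereas a clone with a premature stop would give a truncated protein of lower molecular weight.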

Two minigenes were constructed from two different
PCR2.0 and PCR4.0 clones which met the criteria above and
cloned into the expression vectors. To check the
integrity of the completed minigene, COS cells were
transiently transfected with both SV40-PCR6.0 minigenes
(A4 and B1) and harvested at various time points.
Expression of the PCR6.0 minigene protein was
identified by western blotting using MANNUT1 [31] and
MANCH07 [18].

Similar transfections were done using these
constructs, then the cells were fixed and immunostained using
MANCH07 [18]. Staining of the minigene product appeared to be
membrane bound, suggesting that the actin binding domain
or the CRCT or both are functional.

The myc tag epitope has also been cloned in frame
into the unique HpaI site within the minigene. This
construct, SV40-PCR6.0-myc, was also transfected into
COS cells and immunolocalised using the myc tag mouse
monoclonal antibody, 9E10. Again membrane localisation
was observed, showing that introduction of the 10 amino
acids which constitute the myc tag epitope does not
appear to affect the properties of the minigene.

Example 2

In Vivo Compensation for Dystrophic Deficiency by
Utrophin Expression

We have tested expressing a utrophin transgene in
the dystrophin deficient mdx mouse. Our results
indicate that high expression of the utrophin transgene
in skeletal muscle can reverse the dystrophic
pathology. These data suggest that systemic
up-regulation of utrophin in DMD patients is a very
promising avenue for the development of an effective
treatment for this devastating disorder.

A truncated utrophin transgene was modeled on the
Becker dystrophin transgene which has been shown to
correct the dystrophic phenotype of mdx mice [5,6] (Fig.
5A). In order to generate high levels of muscle
expression the utrophin transgene was driven by the
human skeletal alpha actin (HSA) promoter (Fig. 5B). A
number of transgenic lines expressing the utrophin
transgene were generated with differing levels of
transgenic expression. Immunoblot analysis of muscle
samples from transgenic lines demonstrating high level
expression is shown in Fig. 5C. The multiple fainter
bands are probably due to the proteolytic breakdown of
the highly expressed transgene product [24]. Line 347
also shows weak expression of the transgene in the
heart. Analysis of the F-3 line showed no evidence of
transgene expression in heart, brain, kidney, lung,
liver, intestine, skin or pancreas. To
demonstrate that the utrophin transgene localised to
the sarcolemma, immunofluorescence of skeletal muscle
sections was performed using utrophin and dystrophin
specific antibodies. Examination of the sarcolemmal
localisation pattern of dystrophin and the utrophin
transgene in consecutive muscle sections demonstrated
that they are able to co-localise in vivo. The normal
localisation of utrophin in adult skeletal muscle is
exclusively at the neuromuscular and myotendinous
junctions and in the capillaries and nerves [3,31].
Immunostaining of unfixed 8µm TA muscle cryosections
was done with 1/25 dilution of G3 (anti-utrophin) or
1/400 dilution of P6 (anti-dystrophin [33]). Initially
the sections were blocked in 10% heat inactivated
foetal calf serum in 50mM Tris, 150mM NaCl pH7.5 (TBS),
then the primary antibody diluted in TBS was added and
incubated for 1h at room temperature. The slides were
washed 4x in TBS for 5min each then incubated for a
further hour at room temperature with 1/1000 dilution
FITC conjugated sheep anti-rabbit IgG (Sigma) diluted
in TBS. Finally the slides were washed as before,
mounted with VectaShield (Vector Labs) and photographed
using a Leica DMRBE microscope and photomicrograph
system.

Although the dystrophin deficient mdx mouse is
only mildly affected, histological and physiological
analysis reveals a number of muscle defects in common
with DMD patients including muscle fibre degeneration
giving rise to a dramatic elevation of serum creatine
kinase (CK) and evidence of massive myofibre
regeneration with most fibres having centrally located
nuclei [34]. Thus changes in the levels of serum CK and
numbers of centralised nuclei have been used to monitor
the pathology of the muscle in a number of transgenic
lines expressing dystrophin transgenes in mdx mice
[5,6,7,24]. Male transgenic F-3 mice carrying the
utrophin transgene were crossed with dystrophin
deficient female mdx mice and the resultant offspring
analysed (Fig. 6A). The CK levels of 5 week old male
transgenic mdx mice had fallen to approximately a
quarter of those of the non-transgenic mdx male littermates.
Females whether transgenic or not have essentially
normal levels of serum CK. The reduction in the serum
levels of CK in the transgenic male mdx littermates
signifies a change in the muscle pathology of these
mice and implies that a significant decrease in muscle
degeneration has occurred. Fig. 6B shows the contrast
in numbers of centralised nuclei in frozen sections
from the soleus and tibialis anterior (TA) muscle of
transgenic and non-transgenic male mdx mice. The
number of centrally nucleated myofibres is markedly
reduced in the two muscle types examined, showing that
the amount of fibre regeneration is decreased. The
difference in numbers of central nuclei between the
transgenic mdx TA (~10%) and soleus (~30%) is probably
explained by the fact that the HSA promoter is
expressed at lower levels in the slow twitch fibres
which essentially populate the soleus muscle compared
to the fast twitch fibres of the TA. This is an
important observation as it implies that the levels of
utrophin transgene are important for amelioration of
the muscle phenotype.

Dystrophin is normally associated with a large
oligomeric protein complex (dystrophin protein complex;
DPC) embedded in the sarcolemma [3,35]. Loss of
dystrophin in DMD patients and mdx mice also results in
a dramatic loss of sarcolemmal DPC [36]. In transgenic
mdx mice expressing the full length and truncated
dystrophin transgenes, re-establishment of components
of the DPC at the sarcolemma is an important marker for
the restoration of muscle strength by dystrophin
transgenes [5,6,7,24]. We looked at the results of
immunostaining for components of the DPC in TA muscle
from male mdx or mdx expressing the utrophin transgene.
Immunostaining was performed as described above. The primary
antibodies were goat polyclonal sera to α/β-dystroglycan [37]
(FP-B, 1/10), rabbit polyclonal sera to α-sarcoglycan [38]
(1/5) and sheep polyclonal sera to γ-sarcoglycan [39]
(1/10). FITC conjugated secondary antibodies to goat,
rabbit and sheep were diluted 1/50, 1/200 and 1/50
respectively. Sarcolemmal staining of all myofibres by
utrophin specific antibody was seen in transgenic
muscle. However in the non-transgenic mice there is
virtually no sarcolemmal staining apart from
neuromuscular junctions and regions likely to contain
regenerating fibres. In all cases using polyclonal
antibodies specific to α-sarcoglycan, γ-sarcoglycan
and α/β-dystroglycan there was a notable increase in
the staining at the sarcolemma of transgenic TA muscle
indicating an elevation in correctly localised,
sarcolemmal bound DPCs. The increase in sarcolemmal
staining of these components in the soleus muscle is
greater than in the non-transgenic mdx males but not as
elevated as in the TA. This result suggests that
increased utrophin transgene expression correlates with
an increase in sarcolemmal bound DPC.

Analysis of the mdx diaphragm has shown that this
muscle exhibits a continued pattern of degeneration,
fibrosis and functional deficit throughout the lifespan
of the mdx mouse which is comparable to DMD skeletal
muscle [40]. Thus for utrophin to be capable of
replacing dystrophin, over-expression of utrophin in
this muscle has to alter the pathology in a similar way
as demonstrated for the dystrophin transgenic mdx mice
[5,6,7,24]. Immunostaining of diaphragm sections using a
utrophin antibody demonstrates the sarcolemmal
localisation of the utrophin transgene expressed in the
transgenic mdx mouse (utro-tg mdx) compared to the
normal and mdx. Also seen is the re-establishment of
α-sarcoglycan at the sarcolemma of the transgenic mdx
diaphragm at levels similar to the normal diaphragm.
Sarcolemmal staining of the transgenic mdx diaphragm
similar to normal is also seen using antibodies
specific to α/β-dystroglycan and γ-sarcoglycan (data
not shown). Thus, as in skeletal muscle, expression of
the utrophin transgene in diaphragm relocalises the DPC
to the sarcolemma. Histological analysis of
haematoxylin and eosin stained sections of mdx
diaphragm shows extensive regions of fibrosis, cellular
infiltration and variable myofibre size containing
centralised nuclei. However the utrophin transgenic
diaphragm looks essentially the same as normal, with no
necrosis, regular myofibre size and virtually no
centralised nuclei. In the mdx diaphragm, even in
regions which have no necrosis and so appear more
histologically normal, on higher magnification the
myofibres are still of variable size, often containing
centralised nuclei, which is indicative of continual
regeneration. In the utrophin transgenic diaphragm,
even at higher magnification, the whole muscle appears
normal. A return to normal histology and establishment
of the DPC are two important observations, as seen with
the dystrophin transgenic mice [5,6,7,24], which predict
a major recovery of the utrophin transgenic diaphragm
from a dystrophic phenotype.

We have demonstrated a significant decrease in
the dystrophic muscle phenotype of mdx mice by
expressing a utrophin transgene at high levels in the
skeletal muscle and the diaphragm. These results, for
the first time, strongly suggest that utrophin can
replace dystrophin in vivo. This implies that use of
small molecules which increase the normal utrophin
muscle expression to compensate and therefore alleviate
the consequences of a lack of dystrophin is a promising
avenue for DMD therapy. This approach would
potentially target all muscles and thus prolong life by
conserving the respiratory and cardiac muscles.
Utrophin is expressed in many tissues so a generalised
upregulation may not have detrimental side effects. In
our experimental animal model, the normal mice
expressing the utrophin transgene at high levels appear
to suffer no deleterious side effects in their skeletal
and diaphragm muscles. A precedent for such a gene
therapy approach using butyrate to upregulate fetal
haemoglobin is having success in clinical trials of
sickle cell disease [41,42]. Only 20-30% of the wild
type levels of dystrophin are required to significantly
reduce the dystrophic phenotype in mdx mice [9,10]. It
will be interesting to determine whether similar levels
of utrophin will be adequate to compensate for
dystrophin loss. In addition, since utrophin is
normally expressed in all tissues, including muscle,
the use of this utrophin transgene rather than a
dystrophin transgene in conventional gene therapy
approaches e.g. using viruses or liposomes may avert
any potential immunological responses against the
transgene.

Muscles from transgenic mdx and non-transgenic mdx mice were
stressed in vitro. Essentially the test monitors the
high mechanical stress produced by forced lengthening
during active contraction and measures the decrease in
force. The method is described in detail
by Deconinck et al [46]. The measure of deterioration
is a decrease in the force a muscle can apply. This
force drop is irreversible and correlates with the
number of damaged muscle fibres. Mdx muscle is
particularly sensitive to this test and deteriorates
greatly [46].

Our data demonstrate that the force drop in mdx
mice is ~55%. However, in the utrophin transgenic
littermates the force drop was only ~20%. Normal mouse
muscle usually has a force drop of ~15%. Thus the
expression of the utrophin transgene in mdx mice
considerably decreases the damage caused by large
mechanical stress.
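
As a rough worked example, the approximate force drops quoted above (55%, 20% and 15%) can be put on a common scale. The "fraction of deficit corrected" below is an illustrative metric assumed for this sketch, not a measure defined in this specification, and the function name is hypothetical.

```python
def fraction_of_deficit_corrected(mdx_drop: float,
                                  transgenic_drop: float,
                                  normal_drop: float) -> float:
    """Fraction of the mdx-specific excess force drop removed by the transgene.

    0.0 means the transgenic muscle behaves like mdx muscle and 1.0 means
    it behaves like normal muscle. All inputs are percentage force drops.
    """
    excess_in_mdx = mdx_drop - normal_drop
    if excess_in_mdx <= 0:
        raise ValueError("mdx force drop must exceed the normal force drop")
    return (mdx_drop - transgenic_drop) / excess_in_mdx


# Approximate values taken from the text: mdx 55%, transgenic mdx 20%, normal 15%.
corrected = fraction_of_deficit_corrected(55.0, 20.0, 15.0)
```

On these approximate figures the transgene removes 35 of the 40 percentage points of excess force drop, i.e. about seven eighths of the deficit.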

Methods 

Transgene construction and microinjection 

The amino- and carboxy-terminal portions of
utrophin were cloned as PCR products using overlapping
cDNAs as template then ligated together in-frame to
produce the truncated utrophin cDNA. The PCR product
was then cloned into a vector containing the 2.2kb
human skeletal α-actin (HSA) promoter and regulatory
regions [43,44] and SV40 large T poly A site. The
cloning sites were such that the transgene was located
near the beginning of the second HSA untranslated exon
and the Asp718/NotI sites were used to liberate the
complete fragment. Transgenic mice were generated by
microinjection of the purified HSA transgene insert
into the pronucleus of F2 hybrid oocytes from
C57BL/6xCBA/CA parents [45]. Positive transgenic mice
were identified by Southern blotting using a probe to
the central part of the utrophin transgene. A number
of founder F0 males were bred to generate more
offspring for analysis and breeding.



. 0 Protein analysis 

Total muscle extracts were prepared by
homogenisation in 1ml extraction buffer (75mM Tris
pH6.8, 3.8% SDS, 4M Urea, 20% Glycerol, 5% β-mercaptoethanol),
then heated at 95°C for 5min. Usually
50µg of total protein (quantitated using the Biorad DC
protein assay kit) was loaded onto 6% polyacrylamide
gels and transferred to nitrocellulose. Utrophin
transgene expression was detected using a 1/200
dilution of mouse anti-utrophin monoclonal antibody
(MANCHO7 [18]) and visualised using anti-mouse IgG-POD
and chemiluminescence (Boehringer). For sectioning,
skeletal muscle samples were removed and immersed in
OCT compound (BDH) and frozen in liquid nitrogen cooled
isopentane. Diaphragm was removed, cut in half, then
rolled longitudinally and sandwiched between ox liver
to facilitate orientation and easier sectioning. The
sandwich was then frozen. Immunostaining of unfixed
8µm cryosections was performed by blocking the sections
in 10% heat inactivated foetal calf serum in 50mM Tris,
150mM NaCl pH7.5 (TBS), then the primary antibody
diluted in TBS was added and incubated for 1h at room
temperature. The slides were washed 4x in TBS for 5min
each then incubated for a further hour at room
temperature with conjugated second antibody diluted in
TBS. Finally the slides were washed as before, mounted
with VectaShield (Vector Labs) and photographed using a
Leica DMRBE microscope and photomicrograph system.
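As a worked illustration of the gel-loading step, the volume of extract that contains 50µg of total protein follows from the measured protein concentration. The sketch below assumes a hypothetical concentration of 4µg/µl; the function name and that value are illustrative only and do not come from the experiments.

```python
def loading_volume_ul(target_ug: float, conc_ug_per_ul: float) -> float:
    """Volume (µl) of extract containing target_ug of total protein."""
    if conc_ug_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return target_ug / conc_ug_per_ul


# 50 µg is the amount stated in the text; 4 µg/µl is an assumed assay result.
volume = loading_volume_ul(50.0, 4.0)
print(f"load {volume:.1f} µl per lane")
```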



Antibodies used for immunofluorescence 

Antibodies were used at the following dilutions.
Polyclonal rabbit against utrophin (G3, 1/25),
dystrophin (PS [33], 1/400), β1-syntrophin (syn35, 1/50),
α-sarcoglycan [38] (1/5). Goat polyclonal against
α/β-dystroglycan (FP-B [37], 1/10). FITC conjugated
secondary antibody to goat (Sigma) and Cy3 conjugated
secondary antibody to rabbit (Jackson Laboratories)
were diluted 1/50 and 1/200 respectively.
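The dilutions listed above translate into pipetting volumes once a final working volume is chosen. The following is a minimal sketch; the 200µl working volume, the function name and the dictionary labels are assumptions made for illustration, while the dilution factors are those given in the text.

```python
def dilution_volumes(final_volume_ul: float, dilution_factor: float) -> tuple[float, float]:
    """Return (antibody_ul, diluent_ul) for a 1/dilution_factor dilution."""
    if final_volume_ul <= 0 or dilution_factor < 1:
        raise ValueError("final volume must be positive and dilution factor >= 1")
    antibody = final_volume_ul / dilution_factor
    return antibody, final_volume_ul - antibody


# Dilution factors from the text; labels are shorthand chosen for this sketch.
PRIMARY_DILUTIONS = {
    "utrophin (G3)": 25,
    "dystrophin (PS)": 400,
    "syntrophin (syn35)": 50,
    "alpha-sarcoglycan": 5,
    "dystroglycan (FP-B)": 10,
}

FINAL_VOLUME_UL = 200.0  # assumed working volume per slide

for name, factor in PRIMARY_DILUTIONS.items():
    antibody, diluent = dilution_volumes(FINAL_VOLUME_UL, factor)
    print(f"{name}: {antibody:.1f} µl antibody + {diluent:.1f} µl TBS")
```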

Creatine kinase assay

Serum CK levels were assayed in 4-5 week old mice
from four F3 litters produced by crossing a male transgenic
mouse with female mdx mice. The tail
tips were cut off and DNA prepared for Southern
blotting to establish the transgenic status of each
mouse. Blood was collected simultaneously, allowed to
Serum creatine kinase levels
Boehringer NAC-CK kit and 5µl
minute was averaged over 4 min



Example 3

To see if expression of utrophin is beneficial to
muscle in the process of regenerating, the myc-tagged
truncated utrophin minigene under the control of the
HSA promoter (HSA-PCR6.0-myc) was directly injected
into mdx muscle.

Our data demonstrate the sarcolemmal localisation
of the utrophin minigene in a proportion of fibres
close to the injection site. The utrophin minigene was
detected using the antibody 9E10 which is specific to
the myc tag epitope. Importantly, where 9E10 was
localised there was significant staining of α- and
γ-sarcoglycan. The α-sarcoglycan staining was
essentially negative in other fibres.

Re-establishment of the dystrophin protein
complex has been shown to be an important marker for
muscle recovery [5,6,7,24]. This result suggests that
even when the disease process has manifested itself,
namely the degeneration and regeneration seen in mdx
muscle, expression of utrophin is beneficial. This is
important when considering that in DMD one third of
affected boys carry new mutations. Thus only when the
first symptoms of DMD manifest themselves a
couple of years after birth can a diagnosis be made.



Example 4 

Utilising a PCR strategy to generate fragments
from human 1st strand DNA, the remainder of the human
utrophin sequences missing from PCR6.0 were cloned.
The fragment utilised primers which allowed the rod
domain to be cloned into the unique HpaI restriction
site (see Figure 2c) to produce a clone which contained
all of the amino acid coding sequence needed to produce the
complete utrophin protein. Figure 9 shows the DNA
sequence of the full length utrophin construct with the
amino acid sequence shown above using the standard
single letter code.
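The relationship between the nucleotide line and the single letter amino acid line in Figure 9 can be checked by translating the DNA with the standard genetic code. The sketch below is illustrative only: it uses the first 60 nucleotides of Figure 9 as printed, and the function and variable names are assumptions of this example rather than part of the construct description.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, start: int = 0) -> str:
    """Translate dna from index start until a stop codon or an incomplete codon."""
    dna = dna.upper()
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the reading frame
            break
        protein.append(amino_acid)
    return "".join(protein)


# Nucleotides 1-60 of Figure 9 as printed.
figure9_start = "ACTAGTCAAGATGGCCAAGTATGGAGAACATGAAGCCAGTCCTGACAATGGGCAGAACGA"
first_atg = figure9_start.find("ATG")  # position of the initiator codon
print(translate(figure9_start, first_atg))  # MAKYGEHEASPDNGQN
```

Translating from the first ATG reproduces the opening residues printed above the sequence, and the same check can be used to spot transcription errors in either line of the figure.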

The utrophin full length construct has been
cloned into the human skeletal alpha actin promoter
(HSA) expression construct in a similar manner to that
shown in Figure 5b. This full length utrophin
expression construct has been used to generate
transgenic mice capable of expressing the full length
utrophin protein in mouse muscle. Similar experiments
may be performed as described in Example 2 to identify
any differences in the effectiveness of the full length
utrophin protein compared with the truncated utrophin
protein in alleviating the muscle pathology in mdx
mice.
In order to assess whether high levels of
utrophin expression in all tissues is detrimental, to
assist in planning therapeutic protocols and in
particular choosing between tissue-specific expression
or non-specific expression, a mouse model is being
developed using the full length utrophin construct.
Transgenic mice will be created expressing the full
length utrophin protein under the regulation of a
promoter which is expressed in all tissues. The
promoter chosen for the first experiments is the human
Ubiquitin-C promoter which has been shown to express in
all tissues. Once these mice are shown to be
expressing the full length utrophin transgene they will
be monitored to identify any potential side effects
caused by abnormally high levels of utrophin.
CLAIMS

1. A nucleic acid isolate having a sequence of
nucleotides encoding a polypeptide including the amino
acid sequence of Figure 3.

2. Nucleic acid according to claim 1 wherein said
sequence of nucleotides is the coding nucleotide
sequence of Figure 3.

3. Nucleic acid according to claim 1 wherein said
sequence of nucleotides is a variant or derivative, by
way of one or more of addition, substitution, insertion
and deletion of one or more nucleotides, of the coding
nucleotide sequence of Figure 3.

4. A nucleic acid isolate having a sequence of
nucleotides encoding a polypeptide which is able to
bind actin and able to bind the dystrophin protein
complex, which includes an amino acid sequence which is
a variant or derivative, by way of one or more of
addition, substitution, insertion and deletion of one
or more amino acids, of the amino acid sequence of
Figure 3, and which is distinguishable immunologically
from dystrophin.

5. Nucleic acid according to claim 4 wherein the
polypeptide includes the amino acid sequence of Figure
9.

6. Nucleic acid according to claim 5 having the
coding nucleotide sequence of Figure 9.

7. Nucleic acid according to claim 5 having a coding
nucleotide sequence which is a variant or derivative,
by one or more of addition, deletion, insertion or
substitution of one or more nucleotides, of the coding
sequence of Figure 9.

8. Nucleic acid according to any preceding claim
comprised in a vector.

9. Nucleic acid according to claim 8 wherein said
vector is an expression vector.

10. A composition including nucleic acid according to
any of claims 1 to 9 and a pharmaceutically acceptable
excipient.

11. A cell containing nucleic acid according to any of
claims 1 to 9.

12. A cell according to claim 11 which is a muscle
cell.

13. A cell according to claim 11 or claim 12 wherein
said polypeptide is expressed.

14. A cell according to any of claims 11 to 13 which
is in a mammal.

15. A mammal having a cell according to any of claims
11 to 13.

16. A mammal containing nucleic acid according to any
of claims 1 to 9.

17. A method including introduction of nucleic acid
according to any of claims 1 to 9 into a cell.

18. A method according to claim 17 wherein said cell
is a muscle cell.

19. A method according to claim 17 or claim 18 wherein
said introduction takes place in vitro.

20. A method which includes causing or allowing
expression of the coding nucleotide sequence of nucleic
acid according to any of claims 1 to 9 in a cell.

21. A method according to claim 20 wherein the cell is
part of a mammal.





WO 97/22696 PCT/GB96/03156 


22. A method according to claim 20 wherein the
expression product is purified and/or isolated
following expression.

23. A method according to claim 22 wherein the
expression product is formulated into a composition
which includes at least one additional component,
following purification and/or isolation of the
expression product.

24. A polypeptide as encoded by nucleic acid according
to any of claims 1 to 3.

25. A polypeptide as encoded by nucleic acid according
to claim 4, excluding natural utrophin.

26. A composition including a polypeptide according to
claim 24 or claim 25 and a pharmaceutically acceptable
excipient.

27. A method for ameliorating one or more symptoms of
a dystrophic phenotype in a mammal, the method
including providing cells of the mammal with a
polypeptide according to claim 24.

28. A method for ameliorating one or more symptoms of
a dystrophic phenotype in a mammal, the method
including providing cells of the mammal with a
polypeptide according to claim 25.

29. A method according to claim 27 or claim 28 wherein
the polypeptide is provided to the cells by expression
from encoding nucleic acid administered to the mammal.

30. Use of nucleic acid according to any of claims 1
to 9 in the manufacture of a medicament for treating a
dystrophin phenotype in a mammal.
Figure 3

MAKYCEH.EA S P D N G Q N E 
I ACTAGTCAAGATGGCCAAGTATGGAG AACATGAAGCCAGTCCTGACAATGGGCAGAACGA 6 0 

FSDIIESRSDEHNDVQKKTF 
61 ATTCAGTGACATCATTGAGTCCAGATCTGATGAACACAATGATGTACAGAAGAAAACCTT 12 0 

TKWINARFSKSGKPPISDMF 
121 TACCAAATGGATAAACGCTCGATTTTCCAAGAGTGGGAAACCACCCATCAGTGATATGTT 180 

SDLKDGRKLLDL LEGLTGT'S 
181 CTCAGACCTCAAAGATGGGAGAAAGCTCTTGGATCTTCTCGAAGGCCTCACAGGAACATC 2 4 0 

L PKERG STRVHA LNNVNRV L 
241 ATTGCCAAAGGAACGTGGTTCCACAAGGGTGCATPCCTTAAACAATGTCAACCGAGTGCT 3 00 

QVLHQNNVDLVNIGGTDIVD 

3 01 ACAGGTTTTACATCAGAACAATGTGGACTTGGTGAATATTGGAGGCACGGACATTGTGGA 3 60 

GNPKLT LGLLWS I I LHWQVK 
361 TGGAAATCCCAAGCTGACTTTAGGGTTACTCTGGAGCATCATTCTGCACTGGCAGGTGAA 420 

DVMKDI MSDLQQTNSEKILL 
421 GGATGTCATGAAAGATATC ATGTCAGACCTGCAGC AGAC AAACAGCG AGAAG ATCCTGCT 4 8 0 

SWVRQTTRPYSQVNVLNFTT 

4 81 GAGCTGGGTGCGGCAGACCACCAGGCCCTACAGTCAAGTCAACGTCCTCAACTTCACCAC 5 4 0 

SWTDGLAFN AVLHRHKPDLF 
541 CAGCTGG ACCGATGGACTCGCGTTCAACGCCGTGCTCCACCGGC AC AAACCAGATCTCTT 600 

SWDRVVKMSPIERLEHAFSK 
6 01 C AGCTGGGACAGAGTGGTCAAAATGTCCCCAATTGAGAGACTTGAACATGCTTTTAGCAA 66 0 

AHTYLGIEKLLDPEDVAVHL 

6 61 GGCCCACACTTATTTGGGAATTGAAAAGGTTCTAGATCCTGAAGATGTTGCTGTGCATCT 72 0 

PXXXXXXXXXXXXVEVLPQQ 

721 ccc^w^IN^mN^m^n^^N^^^^tf^n^^ 780 

VTIDAI REVETL P RKYKKEC 

7 81 ACTCACGATAGATGCCATCCGAGAGGTGGAGACTCTCCCAAGGAAGTATAAGAAAGAATG 8 4 0 

EEEE1H I QSAVLA EEGQSPR 
34 1 TGAAGAGGAAGAAATTCATATCCAGAGTGCAGTGCTGGCAGAGGAAGGCCAGAGTCCCCG 90 0 

AETPSTVTEVDMDLDSYQIA 
901 AGCTGAGACCCCTAGCACCGTCACTGAAGTGGACATGGATTTGGACAGCTACCAGATAGC 9 60 

LEEVLTWLLSAEDTFQEQDD 
9 61 GCTAG AGGAAGTGCTGACGTGGCTGCTGTCCGCGGAGGACACGTTCCAGGAGCAAGATGA 10 2 0 

I SDDVE EVKEQFATHETFMM 
1021 CATTTCTGATGATGTCGAAGAAGTCAAAGAGCAGTTTGCTACCCATGAAACTTTTATGAT 1080 

ELTAHQSSVGSVUQAGNOLM 
1 OB 1 CGAGCTGACAGCACACCAGAGCAGCCTGGGGAGCGTCCTC.CAGGCTGGCAACCAGCTGAT 114 0 

TQGTLS EEEEFE I QEQMTLL 
1141 G ACAC AAGGGACTCTGTCAGAGG AGG AGGAGTTTG AGATCCAGGAACAGATGACCTTGCT 12 00 
NARWEALRVESMERQSRLHD
12 01 GAATGC AAGGTGGGAGGCGCTCCGGGTGGAGAGCATGGAGAGGCAGTCCCGGCTGCACGA 1260 

ALMELQKKQLQQLSSWLALT 

12 61 CGCTCTGATGGAGCTGCAGAAGAAACAGCTGCAGCAGCTCTCAAGCTGGCTGGCCCTCAC 1320 

EERIQKMESPPLGDDLPSLQ 
1321 AGAAGAGCGCATTCAGAAGATGGAGAGCCCTCCGCTGGGTGATGACCTGCCCTCCCTGCA 1380 

KLLQEHKSLQNDL, EAEQVKV 

13 81 GAAGCTGCTTCAAGAACATAAAAGTTTGCAAAATGACCTTGAAGCTGAACAGGTGAAGGT 1440 

NSLTHKVVIVDENSGESATA 
1441 AAATTCCTTAACTCACATGGTGGTGATTGTGGATGAAAACAGTGGGGAGAGTGCCACAGC 1500 

LLEDQLQKLGERWTAVCRWT 
1501 TCTTCTGGAAGATCAGTTACAGAAACTGGGTGAGCGCTGGACAGCTCTATGCCGCTGGAC 1560 

EERWNRLQEISILWQELLEE 
1561 TGAAGAACGTTGGAACAGGTTGCAAGAAATCAGTATTCTGTGGCAGGAATTATTGGAAGA 16 20 

QCLLEAWLTEKEEALNKVQT 
1621 GCAGTGTCTGTTGGAGGCTTGGCTGACCGAAAAGGAAGAGGCTTTGAATAAAGTTCAAAC 1680 

SNFKDQKELSVSVRRLAILK 
1681 CAGCAACTTTAAAGACCAGAAGGAACTAAGTGTCAGTGTCCGGCGTCTGGCTATATTGAA 1740 

EDMEMKRQTLDQLSEIGQDV 
1741 GGAAGACATGGAAATGAAGAGGCAGACTCTGGATCAACTGAGTGAGATTGGCCAGGATGT 1800 

GQLLSNP KASKKMNSDS EE L 
18 01 GGGCCAATTACTCAGTAATCCCAAGGCATCTAAGAAGATGAACAGTGACTCTGAGGAGCT 1860 

TQRWDSLVQRLEDSSNQVTQ 
1861 AACACAGAGATGGGATTCTCTGGTTCAGAGACTCGAAGACTCTTCTAACCAGGTGACTCA 1920 

AVAKLGMSQ IPQKDLLETVH 
1921 GGCGGTAGCGAAGCTCGGCATGTCCCAGATTCCACAGAAGGACCTATTGGAGACCGTTCA 1980 

VREKGMVKKPKQELPPPLTK 
1981 TGTGAGAGAAAAAGGG ATGGTGAAGAAGCCCAAGC AGGAACTGCCTCCTCCGTTAACAAA 2040 

AEHAMQKRSTTELGENLQEL 
2041 GGCTGAGCATGCTATGCAAAAGAGATCAACCACCGAATTGGGAGAAAACCTGCAAGAATT 

RD' LTQEMEVHAEKLKWLNRT 
2101 AAGAGACTTAACTCAAGAAATGGAAGTACATGCTGAAAAACTCAAATGGCTG AATAGAAC 

ELEKLSDKS LSLPERDKISE 
2161 TGAATTGGAGATGCTTTCAGATAAAAGTCTGAGTTTACCTGAAAGGGATAAAATTTCAGA 222 0 

SLRTVNMTWNKICREVPTTL 
2221 AAGCTTAAGGACTGTAAATATG ACATGG AATAAGATTTCCAGAGAGGTGCCTACCACCCT 

KEC1QEPSSVSQTRIAAHPN 
22 81 G AAGGAATGCATCC AGG AGCCCAGTTCTGTTTCACAGAC AAGGATTGCTGCTC ATCCTAA 

VQKVVLVSSASDI PVQSHRT 
2 3 41 TGTCCAAAAGGTGGTGCTAGTATCATCTGCGTCAGATATTCCTGTTCAGTCTCATCGTAC 2 4 00 

SEISIPADLDKTITELADWL 
24 01 TTCGGAAATTTCAATTCCTGCTGATCTTGATAAAACTATAACAGAACTAGCCGACTGGCT 2 4 60 
Figure 3 Continued

VLIDQMLKSNIVTVGDVEEI 
2461 GGTATTAATCGACCAGATGCTGAAGTCCAACATTGTCACTGTTGGGGATGTAGAAGAGAT 2 520 

NKTVSRMKITKADLEQRHPQ 
2521 CAATAAGACCGTTTCCCGAATGAAAATTACAAAGGCTGACTTAGAACAGCGCCATCCTCA . 2580 

LDYVFT LAQNL KNKAS SSDK 
25B1 GCTGGATTATGTTTTTACATTGGCACAGAATTTGAAAAATAAAGCTTCCAGTTCAGATAT 2 640 

RTAITEKLERVKNQWDGTQH 

2 641 GAGAACAGCAATTACAGAAAAATTGGAAAGGGTCAAGAACCAGTGGGATGGCACCCAGCA 2700 

GVELRQQQLEDMIIDSLQWD 
2701 TGGCGTTGAGCTAAGACAGCAGCAGCTTGAGGACATGATTATTGACAGTCTTCAGTGGGA 2760 

DHREETEELMRKYEARLYIL 
27 61 TGACCATAGGGAGGAGACTGAAGAACTGATGAGAAAATATGAGGCTCGACTCTATATTCT 2 82 0 

QQARRDPLTKQISDNQILLQ 
2821 TCAGCAAGCCCGACGGGATCCACTCACCAAACAAATTTCTGATAACCAAATACTGCTTCA 2860 

ELGPGDCIVMAFDNVLQKLL 
2881 AGAACTGGGTCCTGGAGATGGTATCGTCATGGCGTTCGATAACGTCCTGCAGAAACTCCT 2940 

EEYG S DDTRNVKETTE YLKT 
2941 GGAGGAATATGGGAGTGATGACACAAGGAATGTGAAAGAAACCACAGAGTACTTAAAAAC 3000 

SWIN LKQSIADRQNALEAEW 
3001 ATCATGGATCAATCTCAAACAAAGTATTGCTGACAGACAGAACGCCTTGGAGGCTGAGTG 3060 

RTVQASRRDLENFLKWIQEA 
3061 GAGGACGGTGCAGGCCTCTCGCAGAGATCTGGAAAACTTCCTGAAGTGGATCCAAGAAGC 3120 

ETTVNVLVDASHRENALQDS 
3121 AGAGAC CAC AGTGAATGTGCTTGTGGATGCCTCTCATCGGGAGAATGCTCTTC AGGATAG 3180 

I LARELKQQMQDIQAEIDAH 
3181 TATCTTGGCCAGGGAACTCAAACAGCAGATGCAGGACATCCAGGCAGAAATTGATGCCCA 3 2 40 

NDI FKS IDGNRQKMVKALGM 
3241 CAATGACATATTTAAAAGCATTGACGGAAACAGGCAGAAGATGGTAAAAGCTTTGGGAAA 33 00 

SEEATMLQHRLDDMNQRWND 
33 01 TTCTGAAGAGGCTACTATGCTTCAACATCGACTGGATGATATGAACCAAAGATGGAATGA 33 60 

L K A K S A S I RAH LEASA EKWN 

3 3 61 CTTAAAAGCAAAATCTGCTAGCATCAGGGCCCATTTGGAGGCCAGCCCTGAGAAGTGGAA 34 20 

RLLMSLEELIKWLNMKDEEL 
3421 CAGGTTGCTGATGTCCTTAGAAGAACTGATCAAATGGCTGAATATGAAAGATGAAGAGCT 34 80 

K K Q M PIGGDVPALQLQYDHC 
3481 TAAGAA.^CAAATGCCTATTGGAGGAGATGTTCCAGCCTTACAGCTCCAGTATGACCATTG 3540 

KALRRELKEKEYSVLNAVDQ 
3 541 TAAGGCCCTGAGACGGGAGTTAAAGGAGAAAGAATATTCTGTCCTGAATGCTCTCGACCA 3 600 

ARVF LADQPIEAPEEPRRNL 
3601 GGCCCGAGTTTTCTTGGCTGATCAGCCAATTGAGGCCCCTGAAGAGCCAAGAAGAAACCT 3660 

QSKTELTPEERAQKIAKAMR 
3661 ACAATCAAAAACAGAATTAACTCCTGAGGAGAGAGCCCAAAAGATTGCCAAAGCCATGCG 3720 
Figure 3 Continued

HQS SEVK EKWESLNAVTSNW 
3721 CAAACAGTCTTCTGAAGTCAAAGAAAAATGGGAAAGTCTAAATGCTGTAACTAGCAATTC 3780 

QKQVDKALEKLRDLQGAMDD 
3781 GCAAAAGCAAGTGGACAAGGCATTGGAGAAACTCAGAGACCTGCAGGGAGCTATGGATGA : 3840 

L D A DMK EAESVRNGWK PVGD 
3841 CCTGGACGCTGACATGAAGGAGGCAGAGTCCGTGCGGAATGGCTGGAAGCCCGTGGGAGA 3900 

LLIDSLQDHIEKIMAFREEI 
3 9 01 CTTACTCATTG ACTCGCTGC AGGATCACATTGAAAAAATCATGGCATTTAGAG AAG AAAT 3960 

APINFKVKTVNDLSSQLSPL 

3 9 51 TGCACCAATCAACTTTAAAGTTAAAACGGTGAATGATTTATCCAGTCAGCTGTCTCCACT 4020 

DLH PSLKMSRQLDDLNMRWK 

4 021 TGACCTGCATCCCTCTCTAAAGATGTCTCGCCAGCTAGATGACCTTAATATGCGATGGAA 4080 

LLQVSVDDRLKQLQEAHRDF 
4 081 ACTTTTACAGGTTTCTGTGGATGATCGCCTTAAACAGCTTCAGGAAGCCCACAGAGATTT 4140 

GPSSQHFLSTSVQLPWQRSI 
4141 TGGACCATCCTCTCAGCATTTTCTCTCTACGTCAGTCCAGCTGCCGTGGCAAAGATCCAT 4200 

SHNKVP yyiNHQTQTTCWDK 
4201 TTCACATAATAAAGTGCCCTATTACATCAACCATCAAACACAGACCACCTGTTGGGACCA 42 60 

PKKTELFQSLADLNNVRFSA 
4261 TCCTAAAATGACCGAACTCTTTCAATCCCTTGCTGACCTGAATAATGTACGTTTTTCTGC 4320 

YRTAIKIRRLQKALCLDLLE 
4321 CT AC CGT ACAG C AATCAAAATCCGAAG ACTACAAAAAGC ACT ATGTTTGG ATCTCTT AG A 4380 

LSTTNEI FKQHKLNQNDQLL 
43 81 GTTGAGTACAACAAATGAAATTTTCAAACAGCACAAGTTGAACCAAAATGACCAGCTCCT 4440 

SVPDVINCLTTTYDGLEQMH 
4441 CAGTGTTCCAGATGTCATCAACTGTCnSACAACAACTTATGATGGACTTGAGCAAATGCA 4 500 

KDLVNVPLCVDMCLNWLLNV 
4 501 TAAGGACCTGGTCAACGTTCCACTCTGTGTTGATATGTGTOTCAATTGGTTGCTCAATGT 4 560 

YDTGRTGKIRVQSLKIGLMS 
4561 CTATGACACGGGTCGAACTGGAAAAATTAGAGTGCAGAGTCTGAAGATTGGATTAATGTC 4 620 

L S KG L L E E KYRYLFKEVAG P 
4621 TCTCTCCAAAGGTCTCTTGGAAGAAAAATACAGATATCTCrrTTAAGGAAGTTGCGGGGCC 4680 

TEMCDQRQLGLLLHDAIQI P 
4 681 GACAGAAATGTGTGACCAGAGGCAGCTGGGCCTGTTACTTCATGATGCCATCCAGATCCC 47 40 

RQLG EVAAFGCSNIEPSVRS 
4741 CCGGCAGCTAGGTGAAGTAGCAGCTTTTGGAGGCAGTAATATTGAGCCTAGTGTTCGCAG 4 800 

CFQQNNNKPEISVKEFIDWM 
4 801 CTGCTTCCAACAGAATAACAATAAACCAGAAATAAGTGTGAAAGAGTTTATAGATTGGAT 4 860 

KLEPQSMVWLPVLHRVAAAE 
4 861 GCATTTGGAACCACAGTCC ATGGTTTGGCTCCCAGTTTTACATCGAGTGGCAGC AGCGG A 4 920 

TAKHQAKCNICKECPIVCFR 
4921 GACTGCAAAACATCAGGCCAAATGCAACATCTGTAAAGAATGTCCAATTGTCGGGTTCAG 4 980 
YRSLKHFNYDVCQSCFFSGR
4 9 61 CTATAGAAGCCTTAAGCATTTTAACTATGATGTCTGCCAGAGTTGTTTCTTTTCGGGTCG ' 504 0 

TAKGHKLHYPMVEYCI PTTS 
50 41 AACAGCAAAAGGTCACAAATTACATTACCCAATGGTGGAATATTGTATACCTACAACATC 

GEDVRDFTKVLKNKFRSKKY 
5101 TGGGGAAGATGTACGAG ACTTCACAAAGGTACTTAAGAACAAGTTCAGGTCGAAGAAGTA 

FAKHPRLGYLPVQTVLEGDN 
5161 CTTTGCCAAACACCCTCGACTTGGTTACCTGCCTGTCCAGACAGTTCTTGAAGGTGACAA 

LETPITLISMWPEHYDPSQS 
CTTAGAGACTCCTATCACACTCATCAGTATGTGGCCAGAGCACTATGACCCCTCACAATC 5280 

PQLFHDDTHSRI EQYATRLA 
TCCTCAACTGTTTCATGATGACACCCATTCAAGAATAGAACAATATGCCACACGACTGGC 534 0 

QKERTNGSFLTDSSSTTGSV 
53 41 CCAGATGGAAAGGACTAATGGGTCTTTTCTCACTGATAGCAGCTCCACCACAGGAAGTGT 5 400 

EDEHALIQQYCQTLGGESPV 
GG AAGACGAGCACGCCGTCATCCAGCAGTATTGCCAAACACTCGGAGGAGAGTCCCCAGT 5460 

S QPQS PAQI LKSVEREERGE 
GAGCCAGCCGCAGAGCCCAGCTCAGATCCTGAAGTCAGTAGAGAGGGAAGAACGTGGAGA 5520 

L ERI I ADLEEEQRNLQV EYE 
ACTGG AGAGGATCATTGCTGACCTGGAGGAAGAACAAAGAAATCTAC AGGTGGAGTATGA 558 0 

Q LKDQHLRRGLPVGSP PES I 
5 5 B 1 GCAGCTGAAGGACCAGCACCTCCGAAGGGGGGTCGCTGTCGGTTCACCGCCAGAGTCGAT 564 0 

ISPHHTSEDSELIAEAKLLR 

56 41 TATATCTCCCCATCACACGTCTGAGGATTCAGAACTTATAGCAGAAGCAAAACTCGTCAG 5700 

OHKGRLEARMQI LEDHNKQL 
5701 GCAGCACAAAGGTCGGCTGGAGGCTAGGATGCAGATTTTAGAAGATCACAATAAACAGCT 5760 

ESQLHRLRQLLEQPESDSRI 

57 61 GGAGTCTCAGCTCCACCGCCTCCGACAGCTGCTGGAGCAGCCTGAATCTGATTCCCGAAT 5 B2 0 

N,.G V S PWAS PQH S ALS Y SLD p" 

58 21 CAATGGTGTTTCCCCATGGGCTTCTCCTCAGCATTCTGCACTGAGCTACTCGCTTGATCC 

DASGPQFHQAAGEDLLAPPK 
5 881 AGATGCCTCCGGCCCACAGTTCCACCAGGCAGCGGGAGAGGACCTGCTGGCCCCACCGCA 

DTSTDLTEVMEQIHSTFPSC 
5941 CGACACCAGC ACGGATCTCACGGAGGTCATGGAGCAGATTCACAGCACGTTTCCATGTTG 

CPNVPSRPQAM* 
6001 CTGCCCAAATGTTCCCAGCAGGCCACAGGCAATGTAATCACTAGT 604 5 
Figure 8
Figure 9



MAKYGEKEASP DNGQNE 
1 ACTAGTCAAGATGGCCAAGTATGGAGAACATGAAGCCAGTCCTGACAATGGGCAGAACGA 60 

FSDIIESRSDEHNDVQKKTF 
61 ATTCAGTGACATCATTGAGTCCAGATCTGATGAACACAATGATGTACAGAAGAAAACCTT 12 0 

TKWINARFSKSGKPPISDM-F 
121 TACCAAATGGATAAACGCTCGATTTTCCAAGAGTGGGAAACCACCCATCAGTGATATGTT 180 

SDLKDGRKLLDLLEGLTGTS 
181 CTCAGACCTCAAAGATGGGAGAAAGCTCTTGGATCTTCTCGAAGGCCTCACAGGAACATC 24 0 

LPKERGSTRVHALNNVNRVL 
241 ATTGCCAAAGGAACGTGGTTCCACAAGGGTGCATGCCTTAAACAATGTCAACCGAGTGCT 300 

QVLHQNNVDLVNIGGTDIVD 
3 01 ACAGGTTTTACATCAGAACAATGTGGACTTGGTGAATATTGGAGGC ACGGACATTGTGGA 3 60 

GNPKLTLGLLWSIILHWQVK 
361 TGGAAATCCCAAGCTGACTTTAGGGTTACTCTGGAGCATCATTCTGCACTGGCAGGTGAA 420 

DVMKDI MSDLQQTNSEKILL 
421 GGATGTCATGAAAGATATCATGTCAGACCTGCAGCAGACAAACAGCGAGAAGATCCTGCT 480 

SWVRQTTRPYSQVNVLNFTT 
481 GAGCTGGGTGCGGCAGACCACCAGGCCCTACAGTCAAGTCAACGTCCTCAACTTCACCAC 540 

SWTDGLAFNAVLHRHKPDLF 
541 CAGCTGGACCGATGGACTCGCGTTCAACGCCGTGCTCCACCGGCACAAACCAGATCTCTT 6 00 

SWDRVVKMSPI ERLEHAFSK 
601 CAGCTGGGACAGAGTGGTCAAAATGTCCCCAATTGAGAGACTTGAACATGCTTTTAGCAA 6 60 

AHTYLG I EKLLDPEDVAVHL 
6 61 GGCCCAC ACTTATTTGGGAATTGAAAAGCTTCTAGATCCTGAAGATGTTGCTGTGC ATCT 7 20 

PXXXXXXXXXXXXVEVLPQQ 

721 cccNNN^I^m^^w^]NNN^^a^wN^^>n^^I^^ 7 go 

VTIDAI REVET LPRK YKKE C 
781 AGTCACGATAGATGCCATCCGAGAGGTGGAGACTCTCCCAAGGAAGTATAAGAAAGAATG 840 

EEEEI HI QSAVLAEEGQSPR 

8 41 TGAAGAGGAAGAAATTCATATCCAGAGTGCAGTGCTGGCAGAGGAAGGCCAGAGTCCCCG 900 

AETPSTVTEVDMDLDSYQIA 

9 01 AGCTGAGACCCCTAGCACCGTCACTGAAGTGGACATGGATTTGGACAGCTACCAGATAGC 9 60 

LEEVLTWLLSAEDTFQEQDD 
9 6 1 GCTAGAGGAAGTGCTGACGTGGCTGCTGTCCGCGGAGGACACGTTCCAGGAGCAAGATGA 102 0 

I SDDVEEVKEQFATHETFMM 
1021 CATTTCTGATGATGTCGAAGAAGTCAAAGAGCAGTTTGCTACCCATGAAACTTTTATGAT 10 80 

ELTAHQS SVGSVLQAGNQLM 
1081 GGAGCTGACAGCACACCAGAGCAGCGTGGGGAGCGTCCTGCAGGCTGGCAACCAGCTGAT 114 0 

TQGTLSEEEEFEIQEQMTLL 
1141 CACACAAGGGACTCTGTCAGAGGAGGAGGAGTTTGAGATCCAGGAACAGATGACCTTGCT 1200 
Figure 9 Continued

NARWEALRVESMERQSRLHD 
1201 GAATGCAAGGTGGGAGGCGCTCCGCGTGGAGAGCATGGAGAGGC AGTCCCGGCTGCACGA 1260 

ALKELQKKQLQQLSSWLALT 
12 61 CGCTCTGATGGAGCTGCAGAAGAAACAGCTGCAGCAGCTCTCAAGCTGGGTGGCCCTCAC J.3 20 

EERIQKMESPPLGDDLPSLQ 
1321 AGAAGAGCGCATTCAGAAGATGGAGAGCCCTCCGCTGGGTGATGACCTGCCCTCCCTGCA 1380 

KLLQEHKSLQNDLEAEQVKV 
1381 GAAGCTGCTTCAAGAACATAAAAGTTTGCAAAATGACCrrTGAAGCTGAACAGGTGAAGGT , 1440 

NSLTHMVVIVDENSGESATA 
1441 AAATTCCTTAACTCACATGGTGGTGATTGTGGATGAAAACAGTGGGGAGAGTGCCACAGC 1500 

LLEDQLQKLGERWTAVCRWT 
IS 01 TCTTCTGGAAGATCAGTTACAGAAACTGGGTGAGCGCTGGACAGCTGTATGCCGCTGGAC 1560 

EERWNRLQEISILWQELLEE 
1561 TGAAGAACGTTGGAACAGGTTGCAAGAAATCAGTATTCTGTGGCAGGAATTATTGGAAGA 1620 

QCLLEAWLTEKEEALNKVQT 
1621 GCAGTGTCTGTTGGAGGCTTGGCTCACCGAAAAGGAAGAGGCTTTGAATAAAGTTCAAAC 1680 

SNFKDQKELSVSVRRLAILK 
1681 CAGCAACTTTAAAGACCAGAAGGAACTAAGTGTCAGTGTCCGGCGTCTGGCTATATTGAA 17 40 

EDMEMKRQTLDQLS EIGQDV 
1741 GGAAGACATGGAAATGAAGAGGCAGACTCTGGATCAACTGAGTGAGATTGGCCAGGATGT 1800 

GQLLSNPKASKKMNSDSEEL 
IB 01 GGGCCAATTACTCAGTAATCCCAAGGCATCTAAGAAGATGAACAGTGACTCTGAGGAGCT 1860 

TQRWDS LVQRLEDS SNQVTQ 
1 B 6 1 AAC ACAGAG ATGGG ATTCTCTGGTTC AGAGACTCGAAG ACTCTTCTAACCAGGTGACTC A 1920 

AVAKLGMSQIPQKDLLETVH 
1921 GGCGGTAGCGAAGCTCGGCATGTCCCAGATTCCACAGAAGGACCTATTGGAGACCGTTCA 1980 

VREKGMVKKPKQELPPPLGP 
19 Bl TGTGAGAGAAAAAGGGATGGTGAAGAAGCCCAAGCAGGAACTGCCTCCTCCGTTGGGCCC 2040 

KKRQIHVDIEAKKKFDAISA 
2041 AAAGAAGAGACAGATCCATGTGGATATTGAAGCTAAGAAAAAGTTTGATGCTATAAGTGC 2100 

EL, LNWILKWKTAIQTTEIKE 
2101 AG AGCTGTTGAACTGG ATTTTGAAATGGAAAACTGCCATTCAGACCACAGAGATAAAAG A 2160 

YMKMQDTSEMKKKLKALEKE 
2161 GTATATGAAGATGCAAGACACTTCCGAAATGAAAAAGAAGTTGAAGGCATTAGAAAAAGA 2220 

ORERI PRADELNQTGQI LVE 
2221 ACAGAGAGAAAGAATCCCCAGAGCAGATGAATTAAACCAAACTGGACAAATCCTTGTGGA 2280 

OKGKEGLPT EEIKNVLEKVS 
2281 GCAAATGGGAAAAGAAGGCCTTCCTACTGAAGAAATAAAAAATGTTCTGGAGAAGGTTTC 2340 

S EWKNVSQHLEDLERKIQLQ 
2 341 ATCAGAATGGAAGAATGTATCTCAACATTTGGAAGATCTAGAAAGAAAGATTCAGCTACA 2400 

EDINAYFK. QLDELEKVIKTK 
2 401 GGAAGATATAAATGCTTATTTCAAGCAGCTTGATGAGCTTGAAAAGGTCATCAAGACAAA 2460 
Figure 9 Continued

EEWVKHTSISESSRQSLPSL 
2461 GGAGGAGTGGGTAAAACACACTTCCATTTCTGAATCTTCCCGGCAGTCCTTGCCAAGCTT 2520 

KDSCQRELTNLLGLKPKIEM 
2 521 GAAGGATTCCTGTCAGCGGGAATTGACAAATCrrTCTTGGCCTTCACCCCAAAATTGAAAT ■ ■ 25 80 

ARASCSALMSQPSAPDFVC/R 
2 581 GGCTCGTGCAAGCTGCTCGGCCCTGATGTCTCAGCCTTCTGCCCCAGATTTTGTCCAGCG 2 6 40 

GFDSF LGRYQAVQEAVEDRQ 

2 641 GGGCTTCGATAGCTTTCTGGGCCGCTACCAAGCTGTACAAGAGGCTGTAGAGGATCGTCA 27 00 

QHLENELKGQPGHAYLETLK 
2701 ACAACATCTAGAGAATGAACTK3AAGGGCCAACCTGGACATGCATATCTGGAAACATTGAA 2760 

TLKDVLNDSENKAQVSLNVL 
2761 AACACTGAAAGATGTGCTAAATGATTCAGAAAATAAGGCCCAGGTGTCTCTGAATGTCCT 2820 

NDLAKVEKALQEK.KTLDEIL 
2821 TAATGATCTTGCCAAGGTGGAGAAGGCCCTGCAAGAAAAAAAGACCCTTGATGAAATCCT 2880 

ENQKPAL.HKLAEETKALEKN 
2881 TGAGAATCAGAAACCTGCATTACATAJLACTTGCAGAAGAAJiCAAAGGC^CTGGAGAAAAA 2940 

VHPDVEKLYKQ EFDDVQGKW 
2941 TGTTCATCCTGATGTAGAAAAATTATATAAGCAAGAATTTGATGATGTGCAAGGAAAGTG 3000 

NKLKVLVSKDLHLLEEIALT 
3D 01 GAACAAGCTAAAGGTCTTGGTTTCCAAAGATCTACATTTGCrrTGAGG^ 3060 

LRAFEADSTVI EKWMDGVKD 
3061 ACTCAGAGCTTTTGAGGCCGATTCAACAGTCATTGAGAAGTGGATGGATGGCGTGAAAGA 3120 

FLMKQQAAQGDDAGLQRQLD 
3121 CTTCTTAATGAAACAGCAGGCTGCCCAAGGAGACGACGCAGGTCTACAGAGGCAGTTAGA 3180 

QCSAFVNEI ETIESSLKNMK 
3181 CCAGTGCTCTGCATTTGTTAATGAAATAGAAACAATTGAATCATCTCTGAAAAACATGAA 3240 

E I ETN L R SG PVAGIKTWV QT 
3241 GGAAATAGAGACTAATC^CGAAGTGGTCCAGTTGCrraGAATAAAAACTTGGGTGCAGAC 33 00 

RLGDYQTQLEKLSKEIATQK 
3301 AAGACTAGGTGACTACCAAACTCAACTGGAG AAACTTAGCAAGGAG ATCGCTACTCAAAA 33 60 

SRLSESQEKAANLKKDLAEM 

33 61 AAGTAGGTTGTCTGAAAGTCAAGAAAAAGCTGCGAACCTGAAGAAAGACTTGGCAGAGAT 342 0 

QEWKTQAEEEYLERDFEYKS 
3421 GCAGGAATGGATGACCCAGGCCGAGGAAGAATATTTGGAGCGGGATTTTGAGTACAAGTC 3480 

PEELESAVEEMKRAKEDVjLQ 

34 81 ACC AG AAG AGCTTGAGAGTGCTGTGGAAG AGATGAAGAGGGCAAAAGAGGATGTGTTGCA 3 54 0 

KEVRVKILKDNIKLLAAKVP 
3541 GAAGGAGGTGAGAGTGAAGATTCTCAAGGACAACATCAAGTTATTAGCTGCCAAGGTGCC 3 600 

SGGQELTS ELNVVLENYQLL 

3 601 CTCTGGTGGCCAGGAGTTGACGT'CTGAGCTGAATGTTGTGCTGGAGAATTACCAACTTCT 3 660 



CNRIRGKCHTLEEVWSCWIE 
3 661 TTGTAATAGAATTCGAGGAAAGTGCCACACGCTAgagGAGGTCTGGTCTTGTTGGATTGA 3720 
Figure 9 Continued

LLHYLDLETTWLNTLEERMK 
3721 ACTGCTTCACTATTTGGATCTTGAAACTACCTGGTTAAACACTTTGGAAGAGCGGATGAA 3780 

STEVLPEKTDAVNEALESLE 
37 81 GAGCACAGAGGTCCTGCCTGAGAAGACGGATGCTGTCAACGAAGGCCTGGAGTCTCTGGA .. 3 840 

SVLRHPADNRTQIRELGQTL 
3841 ATCTGTTCrrGCGCCACCCGGCAGATAATCGCACCCAGATTCGAGAGCTTGGCCAGACTCT 3900 

IDGGILDDIISEKLEAFNSR 
3901 GATTGATGGGGGGATCCTGGATGATATAATCAGTGAGAAACTGGAGGCTTTCAACAGCCG 3960 

YEDLSHLAESKQISLEKQLQ 
39 61 ATATGAAGATCTAAGTCACCTGGCAGAGAGCAAGCAGATTTCTTTGG AAAAGCAACTCCA 4 020 

VLRETDQMLQVLOESLGELD 
4021 GGTGCTGCGGGAAACTGACCAGATGCTOCAAGTCTTGCAAGAGAGCTTGGGGGAGCTGGA 4 080 

KQLTTYLTDRI DA FQVPQEA 
4081 CAAAC AGCTCACCACATACCTG ACTGACAGG ATAGATGCTTTCCAAGTTCCACAGGAAGC 4140 

QKIQAEI SAHELTLEELRRN 
4141 TCAGAAAATCCAAGCAGAGATCTCAGCCCATGAGCTAACCCTAGAGGAGTTGAGAAGAAA 4200 

MRSQ PLTS PESRTARGGSQM 
4201 TATGCGTTCTCAGCCCCTGACCTCCCCAGAGAGTAGGACTGCCAGAGGAGGAAGTCAGAT 4260 

DVLQRKLREVSTKFQLFQKP 
4261 GGATGTGCTACAGAGGAAACTCCGAGAGGTGTCCACAAAGTTCCAGCTTTTCCAGAAGCC 43 20 

ANFEQRMLDCKRVLDGVKAE 
4321 AGCTAACTTCGAGCAGCGCATGCTGGACTGCAAGCGTGTGCTGGATGGCGTGAAAGCAGA 4380 

LHVLDVKDVDPDVIQTHLDK 
4381 ACTTCACGTTCTGGATGTGAAGGACGTAGACCCTGACGTCATACAGACGCACCTGGACAA 4440 

CMKLYKTLSEVKLEVETVI K 
4 441 GTGTATGAAACTGTATAAAACrrrGAGTGAAGTCAAACTTGAAGTGGAAACTGTGATTAA 4500 

TGRHIVQKQQTDNPKGMDEQ 
4501 AACAGGAAGACATATTGTCCAGAAACAGCAAACGGACAACCCAAAAGGGATGGATGAGCA 4560 

LTSLKVLYNDLGAQVTEGKQ 

45 61 GCTGACTTCCCTGAAGGTTCTTTACAATGACCTGGGCGCACAGGTG ACAGAAGG AAAACA 4 620 

D LERASQLAR KMKKEAASLS 
4 621 GGATCTGGAAAGAGCATCACAGTTGGCCCGGAAAATGAAGAAAGAGGCTGCTTCTCTCTC 4680 

EWLSATETELVQKSTSEGLL 

46 81 TGAATGGCTTTCTGCTACTGAAACTGAATTGGTACAGAAGTCCACTTCAGAAGGTCTGCT 4 74 0 

GDLDTEI SWAKNVLKDLEKR 

47 41 TGGTGACTTGGATACAGAAATTTCCTGGGCTAAAAATGTTCTGAAGGATCrrcGAAAAGAG 4 800 

KADLNTITESSAALQNLIEG 
4801 AAAAGCTGATTT AAATACCATC AC AGAGAGTAGTGCTGCCCTGCAAAACTTGATTGAGGG 4 860 

S EPI LEER LCVLNAGWSRVR 
4 8 61 CAGTGAGCCTATTTTAGAAGAGAGGCTCTGCGTCCTTAACGCTGGGTGGAGCCGAGTTCG 4 920 

TWTEDWCNTLMNHQNQLETF 
4921 TACCTGGACTGAAGATTGGTGCAATACCTTGATGAACCATCAG AACCAGCTAGAAATATT 4980 
dgnvahistwlyqaeallde
49 81 tgatgggaacgtggctcacataagtacctggctttatcaagctgaagctctattggatga 5040 

iekkptskqeeivkrlvsel 
aattgaaaagaaaccaacaagtaaacaggaagaaattgtg aagcgtttagtatctgagct 5100 

ddanlqvenvrdqali lmna 
ggatgatgccaacctccaggttgaaaatgtccgcgatcaagcccttattttgatgaatgc 5160 

rgsssrelvepklae lnrnf 
ccgtggaagctcaagcagggagcttgtagaaccaaagttagctgagctgaatag^iaagtt 5220 

ekvsqhiksaklliaqeply 
tgaaaaggtgtctcaacatatcaaaagtgccaaattgctaattgctcaggaaccattata 5280 

qclvttetfetgvpfsdlek 
ccaatgtttggtcaccactgaaacatttgaaactggtgtgcctttctctgacttggaaaa 5340 

lendienmlkfvekhlessd 

ATTAGAAAATGACATAGAAAATATGTTAAAATTTGTGGAAAAACACTTGGAATCCAGTGA 5400 

EDEKKDEESAQI EEVLQRGE 
TGAAGATGAAAAGATGGATGAGGAGAGTGCCCAGATTGAGGAAGTTCTACAAAGAGGAGA 5460 

EMLHQPMEDNKKEKIRLQLL 
AGAAATGTTACATCAACCTATGGAAGATAATAAAAAAGAAAAGATCCGTTTGCAATTATT 5520 

LLHTRYNKIKAIPIQQRKMG 
ACTTTTGCATACTAGATACAACAAAATTAAGGCAATCCCTATTCAACAGAGGAAAATGGG 5580 

QLASGIRSSLLPTDYLVEIN 
TCAACTTGCTTCTGGAATTAGATCJVTCACTTCTTCCTACAGATTATCTGGTTGAAATTAA 5640 

KI LLCMDDVELS LNVP ELNT 
CAAAATTTTACTTTGCATGGATGATGTTGAATTATCGCTTAATGTTCCAGAGCTCAACAC 5700 

AIYEDFSFQEDSLKNI KDQL 
57 01 TGCTATTTACGAAGACTTCTCTTTTCAGGAAGACTCTCTGAAGAATATCAAAGACCAACT 5760 

DKLGEO IAVIHEKQPDVILE 
5761 GGACAAACTTGGAGAGCAGATTGCAGTCATTCATGAAAAACAGCCAG ATGTCATCCTTGA 5820 

ASGPEAIQIRDTLTQLNAKW 
5821 AGCCTCTGGACCTGAAGCCATTCAGATCAGAGATACACTTACTCAGCTGAATGCAAAATG 5880 

DRINRMYSDRKGCFDRAMEE 
5 6 81 GGACAGAATTAATAGAATGTACAGTGATCGGAAAGGTTGTTTTGACAGGGC AATGGAAGA 594 0 

WRQFHCDLNDLTQWITEAEE 
5941 ATGGAGACAGTTCCATTGTGACCTTAATGACCTCACACAGTGGATAACAGAGGCTGAAGA 6000 

LLVDTCAPGGSLDLEKARIH 
6001 ATTACTGGTTGATACCTGTGCTCCAGGTGGCAGCCTGGACTTAGAGAAAGCCAGGATACA 6060 

QQELEVGISSHQPSFAALNR 
6061 TCAGCAGGAACTTGAGGTGGGCATCAGCAGCCACCAGCCCAGTTTTGCAGCACTAAACCG 6120 

TGDGIVQKLSQADGSFLKEK 
6121 AACTGGGGATGGGATTGTGCAGAAACTCTCCCAGGCAGATGGAAGCTTCTTGAAAGAAAA 6180 

LAGLNQRWDA1VAEVKDRQP 
6181 ACTGGCAGGTTTAAACCAACGCTGGGATGCAATTGTTGCAGAAGTGAAGGATAGGCAGCC 624 0 



RLKGESKQVMKYRHQLDEII
6241 AAGGCTAAAAGGAGAAAGTAAGCAGGTGATGAAGTACAGGCATCAGCrTAGATGAGATTAT 6 300 

CWLTKAEHAMQKR STTELGE 
6301 CTGTTGGTTAACAAAGGCTGAGCATGCTATGCAAAAGAGATCAACCACCGAATTGGGAGA 6360 

NLOELRDLTQEKEVHAEKLK 
6361 AAACCTGCAAGAATTAAGAGACTTAACTCAAGAAATGGAAGTACATGCTGAAAAACTCAA 6420 

WLNRTELEMLSDKSLSLPER 
6421 ATGGCTGAATAGAACTGAATTGGAGATGCTTTGAGATAAAAGTCTGAGTTTACCTGAAAG 64 80 

DKISESLRTVNMTWNKICRE 
64 Bl GGATAAAATTTCAGAAAGCTTAAGGACTGTAAATATGACATGGAATAAGATTTGCAGAGA 6540 

VPTTLKECIQEPSSVSQTRI 
6541 GGTGCCTACCACCCTGAAGGAATGCATCCAGGAGCCCAGTTCTGTTTCACAGACAAGGAT 6600 

AAHPNVQKVVLVSSASDI PV 
6601 TGCTGCTCATCCTAATGTCCAAAAGGTGGTGCTAGTATCATCTGCGTCAGATATTCCTGT 6660 

QSHRTSEI SIPADLDKTITE 
6661 TCAGTCTCATCGTACTTCGGAAATTTCAATTCCTGCTGATCTTGATAAAACTATAACAGA 6720 

LADWLVLI DQMLKSNIVTVG 
6721 ACTAGCCGACTGGCTGGTATTAATCGACC^GATGCTGAAGTCCAACATTGTCACTGTTGG 6780 

DVEEINKTVSRMKITKADLE 
6781 GG ATGTAGAAGAGATCAATAAG ACCGTTTCCCGAATGAAAATTACAAAGGCTGACTTAGA 6840 

QRHPQLDYVFTLAQNLKNKA 
6841 ACAGCGCCATCCTCAGCTGGATTATGTTTTTACATTGGCACAGAATTTGAAAAATAAAGC 6900 

S SSDMRTA ITEKLERVKNQW 
6901 TTCCAGTTCAGATATGAGAACAGCAATTACAGAAAAATTGGAAAGGGTCAAGAACCAGTG 6960 

DGTQHGVELRQQQLEDMIID 
69 61 GGATGGCACCCAGCATGGCGTTGAGCTAAGACAGCAGCAGCTTGAGGACATGATTATTGA 7020 

SLQWDDHREETEELMRKYEA 
7 021 CAGTCTTCAGTGGGATGACCATAGGGAGGAGACTGAAGAACTGATGAGAAAATATGAGGC 7080 

RLYILQQARRDPLTKQISDN 
7081 TCGACTCTATATTCTTCAGCAAGCCCGACGGGATCCACTCACCAAACAAATTTCTGATAA 714 0 

QILLQELG PGDGI VMAFDNV 
7141 CCAAATACTGCTTCAAGAACTGGGTCCTGGAGATGGTATCGTCATGGCGTTCGATAACGT 72 00 

LQKLLEEYGSDDTRNVKETT 
72 01 CCTGCAG AAACTCCTGG AGG AATATGGG AGTGATG ACACAAGGAATGTGAAAGAAACCAC 7260 

EYLKTSW1 NLKQSIADRQNA 
7261 AGAGTACTTAAAAACATCATGGATCAATCTCAAACAAAGTATTGCTGACAGACAGAACGC 732 0 

LEAEWRTVQASRRDLENFLK 
7321 CTTGGAGGCTGAGTGGAGGACGGTGCAGGCCTCTCGCAGAGATCTGGAAAACTTCCTGAA 7380 

WIOEAETTVNVLVDASHREN 
7 381 GTGGATCCAAGAAGCAGAGACCACAGTGAATGTGCTTGTGGATGCCTCTCATCGGGAGAA 7440 

ALODSILARELKQQMQDIQA 
7441 TGCTCTTCAGGATAGTATCTTGGCCAGGGAACTCAAACAGCAGATGCAGGACATCCAGGC 7500 
EIDAHNDIFKSIDGNRQKMV
7501 AGAAATTGATGCCCACAATGACATATTTAAAAGCATTGACGGAAACAGGCAGAAGATGGT 7560 

KALGNSEEATMLQHRLDDMN 
7561 AAAAGCTTTGGGAAATTCTGAAGAGGCTACTATGCTTCAACATCGACTGGATGATATGAA 7620 

QRWNDLKAKSASIRAHLEAS
7621 CCAAAGATGGAATGACTTAAAAGCAAAATCTGCTAGCATCAGGGCCCATTTGGAGGCCAG 7680 

AEKWNRLLMSLEELIKWLNM 
7681 CGCTGAGAAGTGGAACAGGTTGCTGATGTCCTTAGAAGAACTGATCAAATGGCTGAATAT 7740 

KDEELKKQMPIGGDVPALQL
7741 GAAAGATGAAGAGCTTAAGAAACyUATGCCTATTGGAGGAGATGTTCCAGCCTTACAGCT 7800 

QYDHCKALRRELKEKEYSVL 
7801 CCAGTATGACCATTGTAAGGCCCTGAGACGGGAGTTAAAGGAGAAAGAATATTCTGTCCT 7860 

NAVDQARVFLADQPIEAPEE
7861 GAATGCTGTCGACCAGGCCCGAGTTTTCTTGGCTGATCAGCCAATTGAGGCCCCTGAAGA 7920 

PRRNLQSKTELTPEERAQKI
7921 GCCAAGAAGAAACCTACAATCAAAAACAGAATTAACTCCTGAGGAGAGAGCCCAAAAGAT 7980 

AKAMRKQSSEVKEKWESLNA 
7981 TGCCAAAGCCATGCGCAAACAGTCTTCTGAAGTCAAAGAAAAATGGGAAAGTCTAAATGC 8040 

VTSNWQKQVDKALEKLRDLQ
8041 TGTAACTAGCAATTGGCAAAAGCAAGTGGACAAGGCATTGGAGAAACTCAGAGACCTGCA 8100 

GAMDDLDADMKEAESVRNGW 
8101 GGGAGCTATGGATGACCTGGACGCTGACATGAAGGAGGCAGAGTCCGTGCGGAATGGCTG 8160 

KPVGDLLIDSLQDHIEKIMA 
8161 GAAGCCCGTGGGAGACTTACTCATTGACTCGCTGCAGGATCACATTGAAAAAATCATGGC 8220 

FREEIAPINFKVKTVNDLSS 
8221 ATTTAGAGAAGAAATTGCACCAATCAACTTTAAAGTTAAAACGGTGAATGATTTATCCAG 8280 

QLSPLDLHPSLKMSRQLDDL 
8281 TCAGCTGTCTCCACTTGACCTGCATCCCTCTCTAAAGATGTCTCGCCAGCTAGATGACCT 8340 

NMRWKLLQVSVDDRLKQLQE 
8341 TAATATGCGATGGAAACTTTTACAGGTTTCTGTGGATGATCGCCTTAAACAGCTTCAGGA 8400 

AHRDFGPSSQHFLSTSVQLP
8401 AGCCCACAGAGATTTTGGACCATCCTCTCAGCATTTTCTCTCTACGTCAGTCCAGCTGCC 8460 

WQRSISHNKVPYYINHQTQT 
8461 GTGGCAAAGATCCATTTCACATAATAAAGTGCCCTATTACATCAACCATCAAACACAGAC 8520 

TCWDHPKMTELFQSLADLNN 
8521 CACCTGTTGGGACCATCCTAAAATGACCGAACTCTTTCAATCCCTTGCTGACCTGAATAA 8580 

VRFSAYRTAIKIRRLQKALC
8581 TGTACGTTTTTCTGCCTACCGTACAGCAATCAAAATCCGAAGACTACAAAAAGCACTATG 8640 

LDLLELSTTNEIFKQHKLNQ
8641 TTTGGATCTCTTAGAGTTGAGTACAACAAATGAAATTTTCAAACAGCACAAGTTGAACCA 8700 

NDQLLSVPDVINCLTTTYDG
8701 AAATGACCAGCTCCTCAGTGTTCCAGATGTCATCAACTGTCTGACAACAACTTATGATGG 8760 

LEQMHKDLVNVPLCVDMCLN
8761 ACTTGAGCAAATGCATAAGGACCTGGTCAACGTTCCACTCTGTGTTGATATGTGTCTCAA 8820 

WLLNVYDTGRTGKIRVQSLK
8821 TTGGTTGCTCAATGTCTATGACACGGGTCGAACTGGAAAAATTAGAGTGCAGAGTCTGAA 8880 

IGLMSLSKGLLEEKYRYLFK 
8881 GATTGGATTAATGTCTCTCTCCAAAGGTCTCTTGGAAGAAAAATACAGATATCTCTTTAA 8940 

EVAGPTEMCDQRQLGLLLHD 
8941 GGAAGTTGCGGGGCCGACAGAAATGTGTGACCAGAGGCAGCTGGGCCTGTTACTTCATGA 9000 

AIQIPRQLGEVAAFGGSNIE
9001 TGCCATCCAGATCCCCCGGCAGCTAGGTGAAGTAGCAGCTTTTGGAGGCAGTAATATTGA 9060 

PSVRSCFQQNNNKPEISVKE 
9061 GCCTAGTGTTCGCAGCTGCTTCCAACAGAATAACAATAAACCAGAAATAAGTGTGAAAGA 9120 

FIDWMHLEPQSMVWLPVLHR 
9121 GTTTATAGATTGGATGCATTTGGAACCACAGTCCATGGTTTGGCTCCCAGTTTTACATCG 9180 

VAAAETAKHQAKCNICKECP 
9181 AGTGGCAGCAGCGGAGACTGCAAAACATCAGGCCAAATGCAACATCTGTAAAGAATGTCC 9240 

IVGFRYRSLKHFNYDVCQSC 
9241 AATTGTCGGGTTCAGGTATAGAAGCCTTAAGCATTTTAACTATGATGTCTGCCAGAGTTG 9300 

FFSGRTAKGHKLHYPMVEYC 
9301 TTTCTTTTCGGGTCGAACAGCAAAAGGTCACAAATTACATTACCCAATGGTGGAATATTG 9360 

IPTTSGEDVRDFTKVLKNKF
9361 TATACCTACAACATCTGGGGAAGATGTACGAGACTTCACAAAGGTACTTAAGAACAAGTT 9420 

RSKKYFAKHPRLGYLPVQTV 
9421 CAGGTCGAAGAAGTACTTTGCCAAACACCCTCGACTTGGTTACCTGCCTGTCCAGACAGT 9480 

LEGDNLETPITLISMWPEHY 
9481 TCTTGAAGGTGACAACTTAGAGACTCCTATCACACTCATCAGTATGTGGCCAGAGCACTA 9540 

DPSQSPQLFHDDTHSRIEQY 
9541 TGACCCCTCACAATCTCCTCAACTGTTTCATGATGACACCCATTCAAGAATAGAACAATA 9600 

ATRLAQMERTNGSFLTDSSS 
9601 TGCCACACGACTGGCCCAGATGGAAAGGACTAATGGGTCTTTTCTCACTGATAGCAGCTC 9660 

TTGSVEDEHALIQQYCQTLG
9661 CACCACAGGAAGTGTGGAAGACGAGCACGCCCTCATCCAGCAGTATTGCCAAACACTCGG 9720 

GESPVSQPQSPAQILKSVER
9721 AGGAGAGTCCCCAGTGAGCCAGCCGCAGAGCCCAGCTCAGATCCTGAAGTCAGTAGAGAG 9780 

EERGELERIIADLEEEQRNL
9781 GGAAGAACGTGGAGAACTGGAGAGGATCATTGCTGACCTGGAGGAAGAACAAAGAAATCT 9840 

QVEYEQLKDQHLRRGLPVGS
9841 ACAGGTGGAGTATGAGCAGCTGAAGGACCAGCACCTCCGAAGGGGGCTCCCTGTCGGTTC 9900 

PPESIISPHHTSEDSELIAE
9901 ACCGCCAGAGTCGATTATATCTCCCCATCACACGTCTGAGGATTCAGAACTTATAGCAGA 9960 

AKLLRQHKGRLEARMQILED 
9961 AGCAAAACTCCTCAGGCAGCACAAAGGTCGGCTGGAGGCTAGGATGCAGATTTTAGAAGA 10020 



HNKQLESQLHRLRQLLEQPE 
10021 TCACAATAAACAGCTGGAGTCTCAGCTCCACCGCCTCCGACAGCTGCTGGAGCAGCCTGA 10080 

SDSRINGVSPWASPQHSALS 
10081 ATCTGATTCCCGAATCAATGGTGTTTCCCCATGGGCTTCTCCTCAGCATTCTGCACTGAG 10140 

YSLDPDASGPQFHQAAGEDL
10141 CTACTCGCTTGATCCAGATGCCTCCGGCCCACAGTTCCACCAGGCAGCGGGAGAGGACCT 10200 

LAPPHDTSTDLTEVMEQIHS 
10201 GCTGGCCCCACCGCACGACACCAGCACGGATCTCACGGAGGTCATGGAGCAGATTCAGAG 10260 

TFPSCCPNVPSRPQAM*
10261 CACGTTTCCATCTTGCTGCCCAAATGTTCCCAGCAGGCCACAGGCAATGTAATCACTACT 10320 